ABSTRACT


Myotoxin  a,  a  42-residue  protein  from  the  venom  of  the  prairie  rattlesnake 
(Crotalus viridis viridis), has been studied in aqueous solution by proton nuclear
magnetic  resonance  (NMR)  spectroscopy,  and  a  general  tertiary  structure  has 
been  determined.  Myotoxin  a  is  one  of  a  family  of  highly  homologous  myotoxins 
that  cause  localized  tissue  myonecrosis  upon  envenomation  and  whose 
structures  are  highly  constrained  by  three  disulfide  linkages.  Eighty-six  relevant 
distance  constraints  derived  from  nuclear  Overhauser  enhancement 
spectroscopy  (NOESY)  experiments  were  employed  in  distance  geometry 
calculations.  A  superimposed  subset  of  the  best  refined  structures  yielded  a 
medium resolution (backbone atoms' root mean square distance of 2.5 Å) tertiary
conformation. The structure consists of three strands of anti-parallel beta sheet
bound by three disulfide bonds and connected by short loops and turns,
including a modified type VI (cis-proline) turn. The N-terminal region is not well
defined  due  to  a  paucity  of  constraints.  Myotoxin  a  exists  as  an  equilibrium 
mixture  of  two  forms  in  a  4:1  ratio,  as  evidenced  by  reverse-phase  high 
performance  liquid  chromatography.  Additionally,  each  form  contains  a  small 
amount  of  what  appears  to  be  myotoxin  C.  v.  viridis-2.  Equilibrium  of  both  forms 
is  established  within  one  hour  from  a  single,  isolated  form  at  25°C,  but  isolation 
at  2°C  reduces  the  rate  of  interconversion.  The  existence  of  both  chemical  and 
conformational  heterogeneity  has  produced  complex  NMR  spectra  with  many 
peaks that cannot be unambiguously assigned. These ambiguities have limited
the  number  of  distance  constraints  obtained  and  precluded  the  determination  of 
a  more  highly  defined  tertiary  structure.  Purification  of  myotoxin  a  by  affinity 
chromatography and low temperature separation of conformers should greatly
facilitate  generation  of  well-refined,  highly-converged,  accurate  structures. 

CHAPTER I
INTRODUCTION

The prairie rattlesnake (Crotalus viridis viridis), a subspecies of the
western  rattlesnake,  inhabits  a  region  of  the  central  United  States  from  western 
Montana  eastward  through  the  Dakotas,  south  through  western  Texas  and  into 
northeastern  Arizona.  This  habitat  extends  slightly  into  Canada  and  Mexico  but 
does  not  include  much  of  the  agricultural  regions  of  Nebraska  and  Kansas.  The 
presence  of  C.  v.  viridis  in  Arizona  has  been  attributed  to  Indians  bringing  these 
snakes to Hopi villages for snake dances (Klauber, 1982).

C. v. viridis may grow to a length of about 1.75 meters and can strike to a
distance half its length at over 3 meters per second (Klauber, 1982), envenoming
the target. The snake, when milked, will yield approximately 44 mg (dry weight)
or more of venom. With its venom's intravenous toxicity (LD50 in mice, mg/kg)
of 1.0-1.6 and intraperitoneal toxicity of 2.0-2.3, the adult prairie rattlesnake
holds over a thousand lethal doses (20 g mice) in its venom glands (Glenn &
Straight, 1982).
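
The lethal-dose estimate above can be checked with simple arithmetic. The following is a minimal sketch that uses only the venom yield, LD50 range, and mouse mass quoted in the text; the function name is illustrative. With the quoted figures it gives roughly 1,400 to 2,200 doses by the intravenous route and roughly 950 to 1,100 by the intraperitoneal route.

```python
def lethal_doses(venom_mg, ld50_mg_per_kg, mouse_mass_g=20.0):
    """Number of mouse LD50 doses contained in a given dry mass of venom."""
    # LD50 is quoted per kg of body mass, so scale it to one mouse first.
    dose_per_mouse_mg = ld50_mg_per_kg * (mouse_mass_g / 1000.0)
    return venom_mg / dose_per_mouse_mg


VENOM_YIELD_MG = 44.0  # approximate dry-weight yield quoted in the text

for route, ld50 in [("i.v.", 1.0), ("i.v.", 1.6), ("i.p.", 2.0), ("i.p.", 2.3)]:
    doses = lethal_doses(VENOM_YIELD_MG, ld50)
    print(f"{route} LD50 {ld50} mg/kg: about {doses:.0f} lethal doses")
```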

An  envenomed  adult  human  may  experience  localized  stinging, 
numbness,  tingling  at  the  extremities,  nausea,  localized  swelling  and 
discoloration,  extreme  pain,  faintness  and  coma.  A  child  victim  may  additionally 
exhibit hypertonic muscles and convulsions (Klauber, 1982).

Myotoxins.  While  the  venom  of  C.  v.  viridis  contains  many  components 
that  elicit  various  systemic  and  localized  responses  in  victims,  the  focus  of  this 
study  is  on  myotoxin  a,  a  small,  42-residue  protein  that  induces  myonecrosis 
upon  envenomation.  The  biological  purpose  has  been  suggested  as  the 
limitation  of  flight  of  prey  and  promotion  of  death  caused  by  the  paralysis  of  the 
limbs  and  diaphragm,  respectively  (Ownby  et  al.,  1988;  Griffin  &  Aird,  1990).  In 
humans,  the  myonecrosis  can  cause  permanent  damage,  leading  to  the  loss  of 
extremities  (Tu,  1991).  Myotoxin  a  is  one  of  two  myotoxic  components  in  the 
venom of C. v. viridis. The other, viriditoxin, is a high molecular weight protein
that exhibits myotoxic effects secondary to its hemorrhagic effects (Gleason et
al., 1983). In contrast, myotoxin a is of small molecular mass (Mr 4824) (Griffin
&  Aird,  1990)  and  contains  only  a  non-enzymatic  myotoxic  activity  (Mebs  & 
Ownby,  1990). 

Light  microscope  studies  of  mouse  skeletal  muscle  tissue  after  i.m. 
injection  of  myotoxin  a  revealed  vacuolation  attributed  to  enlargement  of  the 
sarcoplasmic  reticulum  vesicles  within  3  hours.  By  12  hours,  examination 
showed  not  only  a  continued  dilation  of  the  sarcoplasmic  reticulum  but  also  a 
swelling  of  the  perinuclear  space.  By  48  hours,  the  highly  enlarged  sarcoplasmic 
reticulum  had  degraded  into  several  smaller  vesicles,  mitochondria  had 
enlarged,  and  the  myofibrils  had  begun  to  disintegrate.  After  72  hours,  the 
myofilaments  were  completely  disorganized  and  the  cells  were  necrotic  (Ownby, 
1982;  Mebs  &  Ownby,  1990). 

A  mode  of  action  for  myotoxin  a  has  been  suggested  as  an  inhibition  of 
the  Na+/K+  ATPase,  causing  an  influx  of  Na+  with  its  solvating  water  swelling  the 
sarcoplasmic  reticulum  and  then  the  entire  cell  until  fatally  disrupted  (Ownby, 
1982).  Electrophysiological  investigations  of  myotoxin  a  on  mouse  and  rat 
skeletal muscles revealed a reduced (-80 mV to -60 mV) resting membrane
potential that was reversed by tetrodotoxin (Na+ channel inhibitor) or low [Na+],
enhanced by ouabain (Na+-K+ ATPase inhibitor) or low [Cl-], and unaffected by
[K+]. These findings suggest that the direct target is the sarcolemma's Na+
channel, with myotoxin a serving to increase Na+ permeability (Hong & Chang,
1985).  Incubation  of  frozen,  sectioned  human  muscle  tissue  with  horseradish 
peroxidase-conjugated  myotoxin  a  showed  binding  to  the  sarcoplasmic  reticulum 
rather  than  to  the  sarcolemma  (Tu  &  Morita,  1983). 
Myotoxin a and its N- and C-terminal fragments have been shown to bind
to Ca++ ATPase and prevent Ca++ uptake in isolated sarcoplasmic reticulum
vesicles (Baker et al., 1992; Utaisincharoen et al., 1991), suggesting that
myotoxin a may act as an inhibitor of the Ca++ ATPase of the sarcoplasmic
reticulum membrane. However, Engle et al. (1983) found no change in
sarcoplasmic reticulum vesicles' Ca++ uptake or release when treated with
myotoxin  II  from  C.  v.  concolor,  a  highly  homologous  myotoxin  with  similar 
histological  effects  (Ownby  et  al.,  1988). 

Previous  Structural  Studies.  Myotoxin  a  belongs  to  a  unique,  yet  highly 
homologous  family  of  proteins  whose  members  are  all  small  myotoxins  from 
snake  venom  (Bieber  et  al.,  1987).  These  members  include  myotoxins  from  C. 
durissus  terrificus,  C.  adamanteus,  C.  scutulatus  scutulatus,  C.  v.  concolor,  C.  v. 
helleri, and several forms from C. v. viridis, including myotoxin a, that exhibit
sequence microheterogeneity from one another (see Figure 1). Many residues
are completely conserved, and the substitutions that do occur are few and often
conservative.

The  cysteine  residues  are  almost  completely  conserved.  Alternative 
disulfide bond arrangements have been reported for crotamine, including
C4-C37 / C11-C36 / C18-C30 (Conti & Laure, 1988) and interchain disulfide links to
form  homodimers  and,  perhaps,  polymers  up  to  a  hexamer  (Teno  et  al.,  1990). 
Nevertheless,  light  microscope  studies  of  myotoxin  a  and  crotamine  on  mouse 
skeletal  muscle  cells  have  shown  cellular  damage  of  the  same  histology 
(Cameron & Tu, 1978). Recent unpublished results from Bieber et al. confirm
the  disulfide  arrangement  of  myotoxin  a  as  reported  by  Fox  et  al.  (1979)  and 
show  no  evidence  for  covalent  dimerization  of  myotoxin  a. 

Schmidt are those encoded by cDNA and represent the microheterogeneity found in directly sequenced crotamine (K6I;
R31P; W34R). Additionally, Aird et al. (1991) cite a personal communication reference to a highly homologous,
42-residue myotoxin, toxin E, from C. h. horridus. Bold letters indicate residues not in consensus with myotoxin a.

FIGURE 2: Disulfide bond arrangement of myotoxin a (Fox et al., 1979).

Initial  circular  dichroism  (CD)  studies  suggest  that  myotoxin  a  degrades  to 
a  random  coil  structure  upon  reduction  and  alkylation  of  the  disulfide  bonds. 

Light  microscope  studies  revealed  a  loss  of  biological  activity  with  this  chemically 
modified  form,  leading  to  the  conclusion  that  the  disulfide  bonds  were  necessary 
for  myotoxin  a's  toxicity  (Cameron  &  Tu,  1977).  The  disulfide  bridges  alone 
appear  to  impart  great  conformational  constraints  on  the  native  structure.  Note, 
however,  that  Baker  et  al.  (1992)  showed  an  equal  inhibition  of  Ca++  uptake  in 
isolated  sarcoplasmic  reticulum  vesicles  from  N-  and  C-terminal  peptide 
fragments  containing  no  disulfide  bonds  as  from  native  myotoxin  a.  The  link 
between  interactions  of  myotoxin  a  on  Ca++  ATPase  and  observed  myonecrosis 
has  not  been  firmly  established. 

Predictive  and  experimental  methods  led  to  the  conclusion  that  the 
secondary structure of myotoxin a appeared to be mostly β-sheet with little or no
α-helix. Bailey et al. (1979) employed a modified Chou-Fasman secondary
structure prediction technique to yield figures of 14% α-helix / 57% β-sheet and
64% α-helix / 47% β-sheet (overlapping) when using original and revised
parameters, respectively. CD spectroscopy indicated no α-helix but instead
indicated β-sheet and β-turns. Laser Raman analysis gave results of no α-helix,
73% β-sheet, and 27% random coil.

Using  a  predictive  method  based  on  hydrophobicity,  Henderson  and 
Bieber (1987) suggested the presence of a 14-residue N-terminal α-helix on a
structure otherwise composed of β-sheet. The pH-dependent ¹H-NMR shifts of
Y1, H5, and H10 suggested that these residues are in close proximity to each
other, as in an N-terminal helix. Furthermore, the pH titration shifts suggested that
the protonation of one or both histidine residues (H5 & H10) causes the
N-terminal region to become exposed to solvent, which would be consistent with a
helix (at physiological pH) to random coil (at low pH) transition. The side chain of
Y1 showed a coupling pattern that suggested free ring rotation (Henderson et al.,
1987). It is interesting to note that this Y1 is completely conserved among all
members of this family of myotoxins and is essential for the activity of myotoxin a
(Hayes,  1984). 

Henderson  (1986)  completed  the  assignment  of  the  aromatic  residue  side 
chains  as  well  as  that  of  three  singly  occurring  residues  (L25,  M28  &  R31)  using 
one- and two-dimensional techniques. Using 400 MHz ¹H-NMR two-dimensional
spectra (COSY, NOESY and RELAY), Murchison (1989) was able to make
sequence-specific assignments for the NH, CαH, and CβH peaks of ca. 50% of
the residues in myotoxin a. The sequential assignment technique (Billeter et al.,
1982; Wüthrich, 1986) led to the successful assignment by Murchison of only the
region R31-K38. The amino acid types could only be discerned as being G
(AX), AMX or long-chain spin systems. The use of a main chain directed (MCD)
search algorithm (Englander & Wand, 1987) found NOEs indicative of anti-parallel
β-sheet between the regions 4-5 / 30-38 / 17-18. This MCD method also
suggested a type II turn at residues K7-G8-G9 but found no evidence for α-helix
anywhere  in  the  structure. 

Complexity  of  Spectra.  Both  Henderson  (1986)  and  Murchison  (1989) 
found  the  anomaly  of  too  many  peaks  in  the  NMR  spectra  for  the  number  of 
protons  in  a  single  form  of  myotoxin  a.  These  excess  peaks  led  to  ambiguities 
that prevented Murchison from assigning C11-I17 and other regions. Despite the
complexity of spectra, subsequent work by Murchison and Nieman (unpub.)
using 500 MHz ¹H-NMR 2D experiments led to the assignment of peaks to nearly
all  protons  in  myotoxin  a.  Ambiguities  and  a  lack  of  assignments  still  existed  in 
the  N-terminal  region,  which  may  not  be  structurally  well  defined  under  the 
conditions  studied.  Other  incomplete  assignments  existed  for  some  side  chains, 
and  many  peaks  in  the  spectra  remained  unassigned.  No  definitive  attempt  had 
been  made  to  model  an  initial  set  of  constraints  from  NOESY  peak  volumes  into 
three-dimensional  structures. 

The  heterogeneity  of  the  myotoxin  a  preparation  resulting  in  the 
complexity  of  NMR  spectra  has  also  manifested  itself  as  twin  peaks  on  RP-HPLC 
separations  (Murchison,  1989).  The  explanation  for  the  excess  NMR  peaks  has 
been  suggested  as  a  similar  protein  contaminant  that  co-purified  with  myotoxin  a 
(Henderson,  1986;  Murchison,  1989),  aggregation  (Henderson,  1986),  and/or 
isomerization,  to  include  various  possible  disulfide  bond  arrangements  or  proline 
cis-trans  isomerization  at  one  of  the  three  proline  residues  (Murchison,  1989). 

Further  investigation  of  the  heterogeneity  by  Misra  (1991)  revealed  that 
the  myotoxin  a  preparation  gave  two  peaks  on  RP-HPLC  in  a  ca.  4:1  ratio  of 
areas,  with  the  minor  peak  eluting  first.  Injection  of  either  peak,  after  drying  and 
redissolving  at  ambient  temperature,  resulted  in  the  appearance  of  both  peaks  in 
about  the  same  ratio  of  areas.  Carboxypeptidase  Y  treatment  of  each  original 
peak's  fraction  to  remove  the  C-terminal  residue  and  subsequent  amino  acid 
analysis  of  those  residues  revealed  the  presence  of  glycine  and  alanine  in  a  ca. 
4:1  ratio  in  each  peak.  The  glycine  would  correspond  to  G42  of  myotoxin  a,  and 
the  alanine  would  likely  correspond  to  A45  of  viridis-3  (Griffin  &  Aird,  1990).  The 
implication is that both isomerization and chemical microheterogeneity are
present in the myotoxin a preparations and are responsible for the additional
peaks  present  in  NMR  spectra. 

Myotoxin  a  provides  a  suitable  subject  for  study  by  NMR  as  it  is  small, 
highly  soluble,  available  in  sufficient  amounts,  and  has  not  been  reported  as 
crystallized  for  x-ray  studies.  Though  pharmacological  research  on  myotoxin  a 
has  not  been  prevalent  (Stocker,  1990),  much  work  has  been  done  toward 
neutralizing  myotoxic  activity  (Menez,  1991)  and  determining  its  site  of  action 
(Tu, 1991). Myotoxin a presents an intriguing problem in solving its structure-
function relationships since neither has yet been well defined. The complexity
of  spectra  makes  this  process  a  challenge. 


CHAPTER  II 

MATERIALS  AND  METHODS 


Purification.  Myotoxin  a  was  purified  from  Crotalus  viridis  viridis  venom 
(Laser  Lab,  Salt  Lake  City)  using  essentially  the  method  of  Henderson  (1986). 
One gram of dry crude venom was dissolved in 5 ml of 0.1 M KCl/0.05 M Tris
buffer (pH 9.0) and centrifuged (Sorvall RC-5B Refrigerated Superspeed
Centrifuge) 5 min at 1000 rpm at 4°C. The pellet was resuspended in 5 ml of the
same buffer and the suspension was again centrifuged under the same
conditions. The collected supernatant solutions from both centrifugations were
pooled. This sample was run (Buchler peristaltic pump) onto a Fractogel TSK
CM650S (EM Sciences) carboxymethyl cellulose cation exchange column (2.5 x
37.5 cm; Pharmacia) at 2°C in 0.1 M KCl/0.05 M Tris buffer (pH 9.0) at a flow rate
of 1.5 ml/min with a salt gradient from 0.1 to 1 M KCl. Absorbance at 280 nm
was measured on an Instrumentation Specialties Co. (ISCO) Type 6 Optical Unit,
amplified by an ISCO Model 1133 Multiplexer-Expander, and chart recorded on
an ISCO Model UA-5 Absorbance/Fluorescence Monitor. Fractions were
collected automatically (4 min/tube) on a Gibson FC-100 Micro Fractionator.

Most proteins came off in the first major peak, but myotoxin a, with a higher pI,
eluted much later, in the second major peak. The fraction containing myotoxin a
was ultrafiltered in a 43 mm Amicon Model 52 concentrator using a Diaflo YM2
membrane filter (Amicon) at 2°C with 55 psi N2 through several
dilution/concentration  cycles.  The  sample  was  lyophilized  (Virtis  Freeze  Dryer) 
and  kept  desiccated  at  -20°C  until  used. 

NMR Sample Preparation. Lyophilized myotoxin a was dissolved to ca. 3-4 mM
in 85% H2O/15% D2O with a small amount of
3-(trimethylsilyl)propionic-2,2,3,3-d4 acid (TSP) to a total volume of ca. 0.8 ml.
A typical sample was
prepared by dissolving 15.3 mg myotoxin a in 0.86 ml 85% H2O/15% D2O
(Aldrich). To this solution was added 50 µl of 0.075% (w/w) TSP (Aldrich) in 90%
H2O/10% D2O. The pH was monitored on a Radiometer Copenhagen PHM 84
Research pH meter with a Lazar PHR-146 Micro Combination electrode and
adjusted to 3.5 with 1 M and 0.1 M HCl. A 50 µl aliquot was diluted into 950 µl
H2O, and its absorbance at 280 nm was measured on a Varian DMS 100S UV
Visible Spectrophotometer. An extinction coefficient, ε280 = 2.27 mg⁻¹ ml cm⁻¹
(Allen et al., 1986), was employed to calculate the sample concentration.
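
The concentration calculation can be sketched as follows. This is an illustration under stated assumptions, not the procedure as actually run: it assumes a 1 cm cuvette path length, treats the 50 µl into 950 µl dilution as exactly 20-fold, assumes the two solution volumes are additive, and takes Mr 4824 from the Introduction. The function names are illustrative.

```python
EPSILON_280 = 2.27   # ml mg^-1 cm^-1, value quoted in the text (Allen et al., 1986)
MR = 4824.0          # g/mol, Mr quoted in the Introduction


def mg_per_ml_from_a280(a280, dilution_factor=20.0, path_cm=1.0):
    """Beer-Lambert estimate of the undiluted sample concentration.

    path_cm = 1.0 is an assumption; the text does not state the cuvette size.
    """
    return a280 * dilution_factor / (EPSILON_280 * path_cm)


def millimolar(mg_per_ml, mr=MR):
    """Convert mg/ml (equivalent to g/l) to mmol/l."""
    return mg_per_ml / mr * 1000.0


# Gravimetric cross-check for the "typical sample": 15.3 mg in 0.86 ml + 0.05 ml,
# assuming the volumes are additive.
gravimetric = 15.3 / (0.86 + 0.05)
print(f"{gravimetric:.1f} mg/ml, about {millimolar(gravimetric):.1f} mM")
```

The gravimetric figure falls in the ca. 3-4 mM range stated for the NMR samples, which gives a quick sanity check on any absorbance-derived value.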

NMR Acquisition Parameters. ¹H-NMR spectra were acquired on a Varian
Unity 500 spectrometer controlled by a Sun 4/260 workstation running Varian's
VNMR software. The spectrometer was operated at 499.843 MHz with a sweep
width of 6199.6 Hz. Unless otherwise stated, data were acquired at 25°C.
One-dimensional spectra (64 transients) were acquired with 32 K (32,768) points and
referenced to TSP. Low-power continuous pre-irradiation was employed to
suppress the water resonance. Double quantum filtered correlated spectroscopy
(DQFCOSY) (Rance et al., 1983) experiments were normally acquired with 4 K
points, 48 scans, and 800 increments. Total correlation spectroscopy (TOCSY)
(Bax & Davis, 1985) spectra were normally acquired with 2 K points, 32 scans,
and  600  increments  with  various  mixing  times  from  20  to  100  ms.  Nuclear 
Overhauser  enhancement  spectroscopy  (NOESY)  (Kumar  et  al.,  1980) 
experiments  were  usually  acquired  with  2  K  points,  96  scans,  and  512 
increments  with  various  mixing  times  from  50  to  350  ms.  Spectra  were  acquired 
with  phase  sensitive  detection  using  the  hypercomplex  method  (States  et  al., 
1982). 
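
For orientation, the quoted sweep width can be converted to ppm, and to a point spacing for a given number of points along an axis. The short sketch below uses only the spectrometer frequency and sweep width stated above; the 2048-point axis is an illustrative choice and the function names are not from any NMR package.

```python
SPECTROMETER_MHZ = 499.843   # operating frequency quoted in the text
SWEEP_WIDTH_HZ = 6199.6      # sweep width quoted in the text


def hz_to_ppm(offset_hz, spectrometer_mhz=SPECTROMETER_MHZ):
    """1 ppm corresponds to spectrometer_mhz Hz."""
    return offset_hz / spectrometer_mhz


def hz_per_point(sweep_width_hz, n_points):
    """Spacing between adjacent points when n_points span the sweep width."""
    return sweep_width_hz / n_points


width_ppm = hz_to_ppm(SWEEP_WIDTH_HZ)
resolution = hz_per_point(SWEEP_WIDTH_HZ, 2048)  # 2048 points: illustrative only
print(f"sweep width {width_ppm:.2f} ppm, {resolution:.2f} Hz/point over 2048 points")
```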

In  a  software  variation  of  the  above  experiments,  both  the  decoupler  and 
transmitter  signals  are  synthesized  from  the  transmitter  board,  rather  than 
separately  from  the  decoupler  and  transmitter  boards.  This  modification  proved 
to  effectively  suppress  the  water  signal  and  allowed  a  greater  sensitivity  closer  to 
the water resonance, due to greater phase coherence between presaturation
and observe transmissions. These experiments have sometimes been referred
to as "transmitter-NOESY," "transmitter-TOCSY," etc. (TNNOESY, TNTOCSY,
etc.)  experiments.  However,  the  pulse  sequences  are  identical  to  those  of  the 
original  experiments. 

Spectral  Assignment  Strategies.  Spectra  were  initially  Fourier 
transformed  on  the  Sun  4/260  workstation  under  VNMR.  Parameters  and  data 
were  passed  to  a  Silicon  Graphics  Indigo  R3000  workstation  and  subsequently 
converted  into  the  format  of  Felix  2.05  (Hare  Research,  Inc.).  The  spectra  were 
Fourier transformed in Felix using a sine square windowing function shifted π/3
to π/6 and were zero filled to a typical digital resolution of 3.02 Hz/point in F2 and
3.02 Hz/point in F1. The resultant 2 K x 2 K real matrices were inspected, and a
suitable cutoff threshold was selected that minimized signal loss while reducing
noise to a workable level. The size of the matrices was reduced by saving only
data  points  above  the  threshold  values.  The  "squeezed"  matrices  resulted  in  a 
large  file  space  savings  while  maintaining  nearly  all  usable  information.  These 
matrices  were  used  to  make  spectral  assignments  and  distance  constraints  in 
Felix,  running  on  the  Indigo  or  a  Silicon  Graphics  Iris  4D/80GT  workstation. 
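
The idea behind the "squeezed" matrices can be illustrated with a small sketch. This is not the Felix file format, which is not described here; it only shows thresholded sparse storage in general terms. Comparing the absolute intensity against the threshold is an assumption made so that negative peaks are retained, and the function names and toy data are hypothetical.

```python
def squeeze(matrix, threshold):
    """Keep only points whose absolute intensity exceeds the threshold.

    Returns a dict mapping (row, column) to intensity; everything else is
    treated as zero. Using abs() is an assumption, not taken from the text.
    """
    return {
        (i, j): value
        for i, row in enumerate(matrix)
        for j, value in enumerate(row)
        if abs(value) > threshold
    }


def expand(points, n_rows, n_cols):
    """Rebuild a dense matrix, filling discarded points with 0.0."""
    matrix = [[0.0] * n_cols for _ in range(n_rows)]
    for (i, j), value in points.items():
        matrix[i][j] = value
    return matrix


# Toy 3 x 3 "spectrum": two real peaks on a background of low-level noise.
spectrum = [
    [0.1, -0.2, 0.0],
    [0.3, 9.5, -0.1],
    [0.0, -4.2, 0.2],
]
kept = squeeze(spectrum, threshold=1.0)
print(f"stored {len(kept)} of 9 points")
```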

The general strategy for making spectral assignments followed Wüthrich
(1986). First, the chemical shifts and dimensions of all peaks in the fingerprint
(NH-CαH) region (ca. 3.5-6.5 ppm in one dimension and 6.5-11 ppm in the other
dimension) were entered into a Felix database. Peak centers and widths were
determined interactively using the Felix graphical interface. Each crosspeak
resulting from the scalar coupling of the NH and CαH protons within the same
residue was given a unique numerical assignment. For example, the database
entry for one such COSY fingerprint peak might show nh23 as the assignment in
one dimension and ca23 as the assignment in the other dimension. The
corresponding  crosspeaks  on  the  opposite  side  of  the  diagonal  in  the  COSY 
spectra  received  the  same  arbitrary  numerical  assignment. 
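
The bookkeeping described in this first step can be pictured with a small record type. The sketch below is not the Felix database schema; the class, field names, and chemical shifts are hypothetical, and only the nh23/ca23 labelling convention is taken from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Peak:
    """One crosspeak: a chemical shift and an assignment label per dimension."""
    d1_ppm: float
    d2_ppm: float
    d1_label: str = ""
    d2_label: str = ""

    def mirrored(self):
        """Symmetry-related peak on the opposite side of the diagonal."""
        return Peak(self.d2_ppm, self.d1_ppm, self.d2_label, self.d1_label)


def fingerprint_peak(nh_ppm, ca_ppm, number):
    """NH-CaH crosspeak labelled with an arbitrary residue number."""
    return Peak(nh_ppm, ca_ppm, f"nh{number}", f"ca{number}")


peak = fingerprint_peak(8.42, 4.61, 23)   # shifts are made-up example values
twin = peak.mirrored()                    # carries the same arbitrary number
print(peak.d1_label, peak.d2_label, "|", twin.d1_label, twin.d2_label)
```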

Second, peaks were picked in the CαH-CβH region (ca. 2.0-3.5 ppm in one
dimension and 3.5-6.0 ppm in the other dimension). The assignments of peaks
in this region were then brothered across the diagonal to corresponding NH-CαH
peaks in the fingerprint region (see Figure 3). Peaks in the spectrum that had
similar chemical shifts along the CαH resonance axis were displayed in
expanded, aligned tiles. Peaks which were unambiguously aligned with the
NH-CαH peak were assigned as CαH-CβH peaks bearing the same arbitrary
numerical assignment. In cases where more than one NH-CαH peak or more
than two CαH-CβH peaks fell along the same line, an unambiguous brothering
could not be made. The CαH-CβH peaks in such a case would remain
unassigned  for  the  time  being. 
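
The brothering rule in this second step can be expressed as a short sketch. In this work the alignment was judged interactively in Felix; the code below only restates the decision rule (exactly one NH-CαH peak and at most two CαH-CβH peaks on a shared CαH line). The alignment tolerance, function name, and example shifts are assumptions.

```python
def brother_side_chain_peaks(fingerprint, side_chain, tol_ppm=0.02):
    """Brother CaH-CbH peaks to NH-CaH peaks that share a CaH shift.

    fingerprint: list of (arbitrary_number, ca_ppm) for NH-CaH peaks.
    side_chain:  list of (ca_ppm, cb_ppm) for CaH-CbH peaks.
    tol_ppm is a hypothetical alignment tolerance.
    Returns (assigned, unassigned): assigned maps a residue number to its
    CbH shifts; unassigned lists the side chain peaks left for later.
    """
    assigned = {}
    used = set()
    for number, ca_ppm in fingerprint:
        same_line_nh = [n for n, other in fingerprint
                        if abs(other - ca_ppm) <= tol_ppm]
        same_line_cb = [i for i, (a, _) in enumerate(side_chain)
                        if abs(a - ca_ppm) <= tol_ppm]
        # Unambiguous only if one NH-CaH peak and one or two CaH-CbH peaks align.
        if len(same_line_nh) == 1 and 1 <= len(same_line_cb) <= 2:
            assigned[number] = [side_chain[i][1] for i in same_line_cb]
            used.update(same_line_cb)
    unassigned = [p for i, p in enumerate(side_chain) if i not in used]
    return assigned, unassigned


fingerprint = [(23, 4.61), (7, 4.20), (12, 4.21)]          # made-up shifts
side_chain = [(4.61, 3.05), (4.60, 2.90), (4.20, 3.10)]    # made-up shifts
assigned, unassigned = brother_side_chain_peaks(fingerprint, side_chain)
print(assigned, unassigned)
```

In the example, systems 7 and 12 share a CαH line, so the peak at 4.20 ppm is left unassigned rather than guessed.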

Third, all peaks in the TOCSY spectrum of a medium mixing time (60-80
ms) were picked into a separate database. The NH-CαH peaks in the TOCSY
spectrum were identified by the peaks in the corresponding locations on the
COSY spectrum and given the same arbitrary assignments. Side chain TOCSY
peaks with the same NH chemical shift were brothered in the region from 6.5-11 ppm
in one dimension and 0.5-6.0 ppm in the other dimension (referred to hereafter
as the NH-C?H region). Whenever possible, CαH-CβH peak assignments from
the COSY spectrum would be transferred to NH-CβH peaks in the TOCSY
spectrum with the same CβH shifts and which were brothered to the same CαH
shift. The remaining TOCSY peaks along the NH resonance were initially
assigned as C? and the arbitrary number of the residue.

FIGURE 3: Schematic diagram of the connectivity between NH-CαH, CαH-CβH,
and CβH-CγH peaks in a COSY spectrum for a glutamate residue. Another
symmetry-related pattern exists across the diagonal (not shown).

Fourth, the pattern of peaks from each given NH shift in the NH-C?H
region was followed to where it was repeated at the CαH, CβH, and other side
chain proton shifts (see Figure 4). The arbitrary assignments were
correspondingly brothered. This process was done in an iterative manner along
with picking and identifying the corresponding side chain peaks (past CβHs) in
the COSY spectrum and transferring the TOCSY assignments back to these
peaks. The iterative nature of the process allowed resolution of some
ambiguities and identification of some of the outer side chain protons as CγH,
CδH, etc. Additionally, use was made of the TOCSY spectra at different (usually
100 ms) mixing times to better reveal particular peaks in certain regions of the
spectrum. Brothered side chains were identified as either G (AX), AMX, L, R, I,
S (AMX with CβH peaks downfield of 3.5 ppm), P (absent in the NH-C?H region;
identified last after assigning most of the CαH-C?H region, 0.5 to 6.0 ppm in both
dimensions), or long chain (Wüthrich, 1986).
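
A deliberately simplified sketch of this spin system typing is given below. It covers only the categories for which the text states a criterion (G, AMX, S, P, long chain); the criteria for L, R, and I are not given here and are left out. The function name, argument layout, and example shifts are assumptions.

```python
def classify_spin_system(has_nh, beta_ppm, further_ppm):
    """Very rough spin system typing from the peaks brothered to one residue.

    has_nh:      True if the system appears in the NH region.
    beta_ppm:    list of CbH shifts found for the system.
    further_ppm: list of shifts for protons beyond CbH.
    """
    if not has_nh:
        return "P"            # proline has no amide proton
    if not beta_ppm and not further_ppm:
        return "G (AX)"       # only the two CaH protons are coupled
    if not further_ppm:
        # Three-spin CaH/CbH2 system; serine CbH peaks lie downfield of 3.5 ppm.
        if all(shift > 3.5 for shift in beta_ppm):
            return "S"
        return "AMX"
    return "long chain"


print(classify_spin_system(True, [3.9, 3.8], []))        # serine-like
print(classify_spin_system(True, [3.1, 2.9], []))        # generic AMX
print(classify_spin_system(True, [1.9, 1.8], [1.5]))     # long chain
```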

Fifth,  peaks  in  the  aromatic  region  (ca.  6.0-8.0  ppm  in  both  dimensions)  of 
the  COSY  were  picked.  Assignments  of  types  of  residues  were  made  based 
upon  the  characteristic  spin  coupling  patterns.  In  the  case  of  aromatic  side  chain 
peaks  that  could  be  unambiguously  assigned,  these  assignments  were  then 
transferred  to  the  corresponding  peaks  in  the  TOCSY.  Where  possible,  the 
aromatic side chains were brothered to their corresponding CαH peaks in the
TOCSY  spectrum. 

FIGURE 4: Schematic diagram of the recurring connectivity in a TOCSY
spectrum for a glutamate residue. For long side chains, not all peaks are visible
at each resonance, depending on mixing time.

Sixth, all peaks in the NOESY spectrum of a medium mixing time (100
ms) were picked without any initial assignments. Assignments were transferred
from the corresponding TOCSY peaks. The TOCSY crosspeaks represent all
the intraresidue interactions. The NOESY crosspeaks represent all through-space
interactions of protons within ca. 5 Å of each other. Therefore, after
transferring the assignments from the TOCSY to the NOESY spectrum, the
remaining unassigned NOESY crosspeaks represented interresidue interactions.
Wüthrich (1986) has shown that for allowable φ,ψ angles, NH(i)-CαH(i) (intraresidue)
crosspeaks and CαH(i)-NH(i+1) (sequential, interresidue) crosspeaks will occur in the
NOESY spectrum (given the proper acquisition parameters). The sequential
connectivity of the residues in the spectrum is given by connecting an NH(i)-CαH(i)
peak to the NH(i+1)-CαH(i) peak to the NH(i+1)-CαH(i+1) peak to the
NH(i+2)-CαH(i+1) peak
and so on. However, the chain of such sequential connectivities breaks where a
proline (no NH) is present in the sequence or where more than two peaks in
the fingerprint region are aligned. Since the peaks in this region are not confined
to NH(i)-CαH(i) and CαH(i)-NH(i+1) peaks, many NH(j)-CαH(i) peaks (interresidue,
nonsequential) may exist, precluding unambiguous assignment of sequential
connectivities. Wüthrich (1986) suggested that basing the sequential
connectivities solely upon these connections would likely yield correct results
only about half the time. Ambiguities were resolved and fingerprint region connectivities
were verified by using the NH-NH region (6.5-11 ppm in both dimensions) of
NOESY spectra obtained at different pH and temperatures (usually samples of
different pH; here, NH shifts are greater than CαH shifts).
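
The sequential walk can be sketched as a search over peak lists. The code below is a minimal illustration, not the interactive procedure used in this work: it links spin system i to spin system j when exactly one NOESY fingerprint peak pairs the CαH shift of i with the NH shift of j, and it leaves ambiguous cases unlinked. The tolerance, names, and shifts are assumptions.

```python
def sequential_links(systems, noesy_peaks, tol_ppm=0.02):
    """Find unambiguous CaH(i)-NH(i+1) links between spin systems.

    systems:     {name: (nh_ppm, ca_ppm)} intraresidue fingerprint shifts.
    noesy_peaks: list of (nh_ppm, ca_ppm) NOESY fingerprint peaks.
    """
    links = {}
    for name, (nh_i, ca_i) in systems.items():
        candidates = set()
        for nh, ca in noesy_peaks:
            # Skip peaks off this CaH line and the intraresidue peak itself.
            if abs(ca - ca_i) > tol_ppm or abs(nh - nh_i) <= tol_ppm:
                continue
            for other, (nh_j, _) in systems.items():
                if other != name and abs(nh - nh_j) <= tol_ppm:
                    candidates.add(other)
        if len(candidates) == 1:          # ambiguous cases stay unlinked
            links[name] = candidates.pop()
    return links


def walk(links, start):
    """Follow links from a starting spin system until the chain breaks."""
    chain = [start]
    while chain[-1] in links and links[chain[-1]] not in chain:
        chain.append(links[chain[-1]])
    return chain


systems = {"s1": (8.40, 4.60), "s2": (7.90, 4.20), "s3": (8.75, 5.10)}
noesy = [(8.40, 4.60), (7.90, 4.60), (7.90, 4.20), (8.75, 4.20), (8.75, 5.10)]
links = sequential_links(systems, noesy)
print(walk(links, "s1"))
```

A chain found this way ends wherever no unique partner exists, which mirrors the breaks at prolines and at overlapped fingerprint peaks described above.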

By  combining  the  sequential  connectivities  with  the  types  of  spin  systems 
of  the  side  chains,  connected  peaks  were  then  married  up  with  their  uniquely 
corresponding  segments  of  the  primary  sequence  in  order  to  make  the 
sequence-specific  assignments.  Conversion  of  the  Felix  database  arbitrary 
numerical  assignments  to  the  sequence  specific  assignments  was  done  by 
writing  out  the  entities  for  each  spectrum,  filtering  each  written  file  through  a 
UNIX  sed-based  script  conversion  file  (see  Appendix  B),  and  reading  these  new 
files  back  into  their  appropriate  entities. 
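
The relabelling step can be illustrated in Python rather than sed. This is not the script from Appendix B: the output label format (for example R31_nh), the prefixes handled, and the example mapping are all hypothetical. The point of the sketch is the whole-token match, which keeps a label such as nh2 from being rewritten as part of nh23.

```python
import re


def relabel(line, mapping):
    """Replace arbitrary labels such as nh23 or ca23 with sequence-specific ones.

    mapping: {arbitrary_number: residue_name}, e.g. {23: "R31"} (hypothetical).
    Labels whose number is not in the mapping are left unchanged.
    """
    def swap(match):
        prefix, number = match.group(1), int(match.group(2))
        residue = mapping.get(number)
        if residue is None:
            return match.group(0)
        return f"{residue}_{prefix}"

    # \b on both sides forces a whole-token match; \d+ consumes the full number.
    return re.sub(r"\b(nh|ca|cb)(\d+)\b", swap, line)


print(relabel("peak 17 nh23 ca23 nh2", {23: "R31"}))
```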

NOE Distance Constraints. A simplistic approach to deriving distance
constraints from NOE data is to integrate the volumes of the peaks at a given
medium mixing time (ca. 100 ms) that allows for peak buildup but minimizes
effects of spin diffusion. Volume integrals were calibrated to a peak whose
represented distance is known in the structure. Wüthrich (1986) suggested
analyzing  for  secondary  structural  features  first,  then  calibrating  from  the  known 
distances  in  such  structures.  The  derived  distance  would  serve  as  an  upper  limit 
distance  constraint.  The  lower  distance  constraint  would  be  set  to  the  sum  of  the 
van  der  Waals  radii  of  the  two  protons.  A  more  thorough  approach  calculates 
distance  constraints  from  the  buildup  of  the  peaks  from  several  spectra  acquired 
at  different  mixing  times.  However,  distance  constraints  must  still  be  calibrated 
to  a  known  structural  distance.  Felix  provides  an  automated  tool  for  doing  this 
type  of  determination.  The  NOESY  peak  boxes  that  have  been  picked  over  the 
peaks of a NOESY at a single mixing time are sequentially laid over the matrices
of  the  NOESY  spectra  at  different  mixing  times.  The  peak  volumes  are 
integrated  for  each  spectrum.  To  generate  a  table  of  calculated  upper  and  lower 
distance  constraints,  a  peak's  volume  integral  and  its  corresponding  distance  for 
calibration  were  supplied. 
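
The single mixing time calibration described above can be sketched as follows. This is a minimal illustration and not the Felix implementation: it assumes the isolated spin pair approximation, in which the cross-peak volume scales as the inverse sixth power of the interproton distance. The function name, the reference peak values, and the 2.0 Å lower bound (an assumed sum of two proton van der Waals radii) are hypothetical.

```python
def noe_upper_bound(volume, ref_volume, ref_distance):
    """Estimate an upper-limit distance (angstroms) from a NOESY peak volume.

    Assumes the isolated spin pair approximation, V proportional to r**-6,
    so r = r_ref * (V_ref / V) ** (1/6).
    """
    if volume <= 0 or ref_volume <= 0:
        raise ValueError("peak volumes must be positive")
    return ref_distance * (ref_volume / volume) ** (1.0 / 6.0)


# Assumed lower bound: sum of the van der Waals radii of two protons.
LOWER_BOUND = 2.0

# Hypothetical calibration peak: volume 1000 (arbitrary units) at 2.5 angstroms.
# A peak 64 times weaker corresponds to twice the reference distance.
upper = noe_upper_bound(volume=1000.0 / 64.0, ref_volume=1000.0, ref_distance=2.5)
print(LOWER_BOUND, round(upper, 2))
```

Because of the sixth-root dependence, a large error in a measured volume produces only a small error in the derived distance, which is one reason the result is used as an upper limit rather than as an exact distance.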

One  initial  set  of  constraints  was  determined  by  Nieman  (unpub.)  by 
manual  segregation  of  peaks  into  weak,  medium  weak,  medium  and  strong 
based  on  their  integrated  volumes.  The  peaks  were  then  assigned  upper  limit 
distance constraints of 5, 4, 3.5 and 2.8 Å, respectively. The lower distance
constraints were set at 3 Å for weak peaks and the sum of the van der Waals
radii  for  all  other  peaks. 
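
The manual binning scheme can be illustrated with a short sketch. The upper limits and the 3 Å lower limit for weak peaks are taken from the text; the volume cutoffs and the value used for the sum of the van der Waals radii are assumptions made only for this example, since neither is reported here.

```python
# Upper limits (angstroms) per intensity class, as given in the text.
UPPER_LIMIT = {"weak": 5.0, "medium weak": 4.0, "medium": 3.5, "strong": 2.8}

# Assumed sum of two proton van der Waals radii (not stated in the text).
VDW_SUM = 2.0

# Hypothetical volume cutoffs (arbitrary units); the source does not report them.
CUTOFFS = [(100.0, "strong"), (50.0, "medium"), (20.0, "medium weak")]


def classify(volume):
    """Return the intensity class for an integrated peak volume."""
    for threshold, label in CUTOFFS:
        if volume >= threshold:
            return label
    return "weak"


def bounds(volume):
    """Return (lower, upper) distance limits in angstroms for a peak volume."""
    label = classify(volume)
    lower = 3.0 if label == "weak" else VDW_SUM
    return lower, UPPER_LIMIT[label]


for v in (150.0, 60.0, 25.0, 5.0):
    print(v, classify(v), bounds(v))
```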

A separate set of 311 distance constraints was determined from the first
group of myotoxin a spectra to be completely analyzed in Felix. A 200 ms
TNNOESY spectrum of 3.4 mM myotoxin a at pH 3.5 in 85% H2O/15% D2O at
25°C  provided  a  single  set  of  peaks  whose  volumes  were  integrated.  These 
volumes  were  calibrated  against  aromatic  protons  of  fixed  distance  and 
converted  into  upper  limit  distance  constraints  by  Felix. 

Multiple  NOESY  spectra  at  various  mixing  times  were  not  used  because 
they  were  taken  without  transmitter  presaturation  and  at  different  pH.  Therefore, 
these  spectra  at  multiple  mixing  times  had  lower  sensitivity  and  would  not 
accurately  overlay  the  TNNOESY  spectrum  due  to  shifted  peaks.  To  use  these 
spectra  for  determining  volume  buildup  rates  and  subsequent  distance 
constraints would have involved attempting to accurately follow all the shift
changes.  The  chosen  TNNOESY  aligned  with  a  60  ms  TNTOCSY  and  a 
TNDQCOSY  of  the  same  sample.  These  spectra  were  chosen  to  be  fully 
analyzed  under  Felix,  with  qualitative  rather  than  quantitative  augmentation  from 
other  spectra  of  other  similar  samples,  because  they  were  the  highest  quality 
spectra  of  myotoxin  a  to  date  and  provided  the  best  data  for  making 
assignments. 

The set of 311 distance constraints was manually screened to eliminate
redundant constraints. Any constraints on residues Y1, K2, Q3, H5, and K6
were also eliminated because of a lower confidence in their assignments (see
Results). The resultant list of 134 experimental upper limit distance constraints
(77 intraresidue, 29 interresidue sequential, 28 interresidue nonsequential) was
rounded up to the next whole angstrom and manually entered into the required
format  for  distance  geometry  calculations. 
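
The rounding and the three constraint categories used throughout this work can be sketched as below. The constraint list is hypothetical and serves only to show the bookkeeping; the sketch takes "rounded up" to mean the ceiling of the distance, so that a value already on a whole angstrom is left unchanged.

```python
import math
from collections import Counter


def constraint_class(res_i, res_j):
    """Classify a constraint by the sequence separation of its two residues."""
    separation = abs(res_i - res_j)
    if separation == 0:
        return "intraresidue"
    if separation == 1:
        return "sequential"
    return "nonsequential"


def round_up(distance):
    """Round a distance up to the next whole angstrom (ceiling)."""
    return float(math.ceil(distance))


# Hypothetical constraints: (residue i, residue j, derived upper limit in angstroms).
raw = [(12, 12, 2.4), (13, 14, 3.1), (17, 32, 4.6), (22, 23, 3.0)]

rounded = [(i, j, round_up(d)) for i, j, d in raw]
tally = Counter(constraint_class(i, j) for i, j, _ in raw)
print(rounded)
print(dict(tally))
```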

Distance  Geometry  Calculations.  Initial  distance  geometry  calculations 
and  modeling  were  done  with  Dspace  4.0  (Hare  Research,  Inc.)  running  on  a 
Silicon Graphics Iris 4D/80GT workstation. Dspace is an implementation of a
metric matrix distance geometry algorithm (Crippen, 1977; Wemmer, 1990), a
method that has been shown to successfully determine protein structures in
solution (Havel & Wüthrich, 1985; Williamson et al., 1985). The program, as
supplied  by  Hare  Research,  came  with  incomplete  tools  for  calculating  protein 
structures.  The  functions  provided  in  Dspace  allow  the  user  to  build  functioning 
macros  to  make  the  program  perform  an  appropriate  strategy  of  refinement. 
Non-functioning  sample  macros  were  included  with  the  program. 

Macros  were  written  and  revised  to  perform  an  effective  refinement 
strategy  (see  Figure  5).  The  primary  refinement  macro,  zipref.mac,  and  the 
other macros it calls are included in Appendix A. This macro "zippers" the
protein from the N-terminus to the C-terminus starting with the refinement of the
individual  residues,  repeating  with  a  two-residue  window,  and  repeating  over  and 
over  with  an  ever  increasing  window  size  until  the  entire  structure  is  being  refined 
at  once.  Dspace  calculates,  on  a  recurring  basis,  a  penalty  function  which  is 
essentially  a  weighted  sum  of  the  differences  between  the  allowable  and  the 
actual  interatomic  distances  and  angles  in  the  structure  at  the  time  of  the 
calculation.  The  lower  the  penalty  function,  the  better  the  structure  conforms  to 
all  the  constraints,  covalent  (bond  lengths  and  angles),  steric  (vdW  radii),  chiral 
(L-amino  acids)  and  experimental  (NOE-derived  distance  constraints).  The 
refinement  macro  institutes  simulated  annealing  when  the  refinement  process 
fails  to  bring  a  segment  to  within  a  given  penalty  value,  based  on  the  window 
size. 
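
The order in which such a zippered refinement visits segments, and the kind of penalty it monitors, can be sketched as follows. This is not the zipref.mac macro of Appendix A: the function names are hypothetical, the penalty is reduced to a weighted sum of linear distance-bound violations, and the annealing steps and whole-molecule passes are omitted.

```python
def zipper_windows(n_residues):
    """Yield (first, last) residue windows, 1-based and inclusive.

    The window size grows from 1 to n_residues; each size slides from the
    N-terminus to the C-terminus.
    """
    for size in range(1, n_residues + 1):
        for first in range(1, n_residues - size + 2):
            yield first, first + size - 1


def bound_penalty(distances, bounds, weight=1.0):
    """Weighted sum of violations of (lower, upper) distance bounds.

    `distances` and `bounds` are dicts keyed by atom pair; a distance inside
    its allowed range contributes nothing.
    """
    total = 0.0
    for pair, (lower, upper) in bounds.items():
        d = distances[pair]
        total += weight * max(lower - d, 0.0, d - upper)
    return total


print(list(zipper_windows(3)))
print(bound_penalty({("a", "b"): 5.5}, {("a", "b"): (2.0, 5.0)}))
```

For a 42-residue protein this schedule contains 903 windows, which gives some sense of why a single refined structure took many hours.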

Zipref.mac  has  many  desirable  features.  By  annealing  and  minimizing  the 
difference  between  allowable  and  actual  distances  with  single  residues  at  first, 
the  macro  emphasizes  correct  local  geometry,  as  very  few  experimental 
constraints are imposed on any given individual residue. Rigorous annealing in
four-dimensional space takes place in the beginning of the refinement process,
regardless  of  penalty,  thus  forcing  Dspace  to  sample  more  conformational 
space.  Limited  conformational  space  sampling  has  been  a  noted  limitation  of 
this  program  (Metzler  et  al.,  1989).  The  annealing  and  minimization  of  the  whole 
molecule,  periodically  interspersed  with  the  zippered  refinement  with  an  ever 
increasing  window  size,  efficiently  balances  the  effects  of  the  local  and  global 
constraints.  The  macro  constantly  checks  and  corrects  incorrect  chirality  at 
chiral  centers  while  the  program  employs  floating  chirality  at  non- 
stereospecifically  assigned  prochiral  centers  (Weber  et  al.,  1988).  The  macro 
refines  both  the  original  embed  of  atoms  and  its  mirror  image,  generating  both  of 
a  potential  pair  of  “tertiary  enantiomers”  (structures  with  the  same  chirality  within 
the  primary  structure  but  with  essentially  mirror  image  backbone  folding).  This 
means  that  an  embed  that  leads  to  a  well  refined  structure  is  not  wasted  on 
generating  only  the  mirror  image  folding  of  the  correct  structure.  Finally, 
zipref.mac,  by  following  the  progress  of  the  penalty  function,  prevents 
acceptance  of  poor  structures  and  tries  to  better  the  best  structures. 
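
The reason both an embed and its mirror image must be considered is that distance information alone cannot tell them apart. The sketch below, using a hypothetical four-point embed, shows that a reflection preserves every interpoint distance while reversing the sign of a chirality measure (the signed volume of a tetrahedron). It illustrates the principle only and is not taken from the Dspace macros.

```python
import math


def mirror(coords):
    """Reflect coordinates through the yz-plane (x -> -x)."""
    return [(-x, y, z) for x, y, z in coords]


def signed_volume(a, b, c, d):
    """Signed volume of the tetrahedron a-b-c-d; its sign encodes handedness."""
    u = [b[k] - a[k] for k in range(3)]
    v = [c[k] - a[k] for k in range(3)]
    w = [d[k] - a[k] for k in range(3)]
    triple = (u[0] * (v[1] * w[2] - v[2] * w[1])
              - u[1] * (v[0] * w[2] - v[2] * w[0])
              + u[2] * (v[0] * w[1] - v[1] * w[0]))
    return triple / 6.0


# Hypothetical four-point embed.
embed = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
image = mirror(embed)

same_distances = all(
    math.isclose(math.dist(embed[i], embed[j]), math.dist(image[i], image[j]))
    for i in range(4) for j in range(i + 1, 4)
)
print(same_distances, signed_volume(*embed), signed_volume(*image))
```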

Zipref.mac  was  employed  with  the  original  (Nieman,  unpub.)  constraints 
and  versions  thereof  where  suspect  constraints  were  removed  or  modified. 

Additional  distance  geometry  calculations  were  performed  with  DIANA 
1.14 (Distance Geometry Algorithm for NMR Applications) (Güntert et al.,
1991a,b),  which  employs  a  variable  target  function  distance  geometry  algorithm 
(Wemmer,  1990),  a  method  also  fully  capable  of  determining  protein  structures 
in  solution  (Wagner  et  al.,  1987).  DIANA  was  compiled  under  FORTRAN-77  in 
UNICOS  and  run  on  a  Cray  X-MP  supercomputer.  DIANA  was  later  compiled 
under  FORTRAN-77  in  IRIX  and  run  on  a  Silicon  Graphics  Indigo  R3000 
computer. DIANA employs pseudoatoms to accommodate non-stereospecific
assignments at prochiral centers (Wüthrich et al., 1983). The default refinement
strategy  was  used  on  both  the  original  (Nieman,  unpub.)  and  the  TNNOESY- 
derived  sets  of  distance  constraints  and  modifications  thereof. 

Visualization  and  Evaluation.  Structures  generated  in  Dspace  were 
visualized  within  Dspace  on  the  Silicon  Graphics  Iris  4D/80GT  workstation. 
Macros  employed  the  program's  abilities  to  superimpose  structures,  calculate 
root  mean  squared  deviations  (RMSDs),  and  display  selected  parts  of  the 
structure.  Dspace  has  the  ability  to  rotate  line  drawings  in  real  time.  RMSDs 
were  calculated  only  to  a  single  structure  on  which  the  others  were  then 
superimposed.  To  get  a  pairwise  listing  of  RMSDs,  a  group  of  structures  had  to 
be  repeatedly  superimposed  onto  each  member  of  that  group. 

Structures  generated  by  DIANA  were  in  the  form  of  an  atomic  coordinate 
file  that  required  format  conversion  to  be  imported  into  Quanta  (Polygen)  running 
on  the  Silicon  Graphics  Iris  4D/80GT  workstation.  The  UNIX  script  conversion 
file  appears  in  Appendix  B.  Quanta  can  superimpose  structures,  display 
selected  atoms,  and  overlay  a  backbone  tracing  ribbon.  Line  drawings  may  be 
manipulated  in  real  time,  including  in  stereoview.  DIANA  outputs  an  overview  file 
that contains a matrix of all pairwise RMSDs for structures with a target function
below  a  chosen  cutoff  value.  The  overview  file  also  contains  a  listing  of 
repeatedly  violated  constraints  as  well  as  a  listing  of  possible  hydrogen  bonds. 

HPLC  Separations.  Reverse-phase  high  performance  liquid 
chromatography  (RP-HPLC)  was  performed  on  a  BioRad  HPLC  Model  1330, 
controlled by BioRad software running on an Apple IIe computer. The sample of
not  more  than  1  mg  of  myotoxin  a  in  water  was  injected  onto  a  Phenomenex 
Selectosil  5  C4  (250  x  10  mm;  5  micron)  semi-preparative  column.  Samples 
were typically eluted with a gradient of 22-23% acetonitrile (Baker) in triple
distilled water with 0.01 M trifluoroacetic acid (Pierce) (ca. pH 2.0) over 25
minutes  at  a  flow  rate  of  3.0  ml/min.  Absorbance  at  220  nm  was  measured  with 
a  LDC  SpectroMonitor  III  flow  cell  and  recorded  and  analyzed  on  a  Spectra- 
Physics  SP4100  Computing  Integrator.  Fractions  were  collected  manually. 
Selected  fractions  were  dried  on  a  Speed  Vac  Concentrator.  Low  temperature 
separations  were  done  at  ca.  2°C  by  pre-chilling  the  solvents  overnight  in  a  cold 
room  and  keeping  all  solvents,  samples,  and  collected  fractions  in  ice  baths 
during  the  course  of  the  experiments. 

GCG  Analyses.  The  Genetics  Computer  Group  (GCG)  Sequence 
Analysis  Software  Package  was  run  on  a  VAX  6000-430  system  running  VMS 
5.5-2.  Peptide  sequence  homology  searching  was  performed  using  FastA  with 
the  SwissProt  database  of  20,024  sequences  dated  August  1992  using  a  word 
size  of  2.  A  TFastA  search  of  the  GenEMBL  database  of  48,274  sequences 
dated  September  1992  was  performed  with  a  word  size  of  2.  A  Motifs  search 
was also conducted. PeptideStructure was used to perform automated Chou-
Fasman (Chou & Fasman, 1974) and Garnier-Osguthorpe-Robson (Garnier et
al., 1978) secondary structure predictions. The output of PeptideStructure was
visualized  with  a  PlotStructure  one-dimensional  plot.  HelicalWheel  was  used  to 
look  at  the  alignment  of  side  chains  on  a  selected  region  of  myotoxin  a  that 
showed the potential to be in an α-helix.
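
The geometry behind a helical wheel projection can be sketched in a few lines. This is not the GCG HelicalWheel program: it assumes an ideal α-helix with 3.6 residues per turn (100° per residue), and the segment letters and starting position are placeholders rather than the region of myotoxin a that was examined.

```python
def helical_wheel(segment, start=1, step_degrees=100.0):
    """Return (position, residue, angle) for each residue of an ideal alpha-helix.

    The angle is the rotation about the helix axis, in degrees, relative to
    the first residue of the segment.
    """
    return [
        (start + k, residue, (k * step_degrees) % 360.0)
        for k, residue in enumerate(segment)
    ]


# Placeholder segment; the letters do not represent a real sequence.
for position, residue, angle in helical_wheel("ABCDEFGH", start=10):
    print(position, residue, angle)
```

Residues whose angles fall close together lie on the same face of the helix, which is how such a plot reveals a possible amphipathic arrangement of side chains.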


CHAPTER  III 

RESULTS AND DISCUSSION

General Quality of NMR Spectra. The TNCOSY, TNTOCSY and
TNNOESY experiments produced the highest quality NMR spectra, especially
close to the water resonance line. The COSY peaks were well shaped and
easily picked, except in the case of multiple, overlapped peaks. The fingerprint
region should have had at most 45 NH-CαH peaks if all five glycine residues'
pairs of CαH peaks were distinguishable and if all 40 amide protons, including
the N-terminal protons, were visible (42 residues + 5G - 3P + 1 N-term. = 45). In
reality, no NH-CαH peak was found for Y1, so a more realistic expectation would
be to see not more than 43 NH-CαH peaks in the fingerprint region. At the
lowest threshold of display, clearly 57 such peaks appeared in this region. There
were many instances of weak peaks appearing very close to or partially
overlapping strong peaks. The COSY spectrum became unusable due to severe
overlap only in the upfield region occupied by CβH-CγH and further upfield
crosspeaks (0.5-3.5 ppm in both dimensions). The high quality of the spectra
allowed  peak  picking  and  brothering  in  both  dimensions  (on  both  sides  of  the 
diagonal).  Since  the  resolution  is  not  the  same  in  both  dimensions,  the  two  sides 
of  the  spectrum  were  not  exactly  symmetrical.  In  fact,  they  were  complementary 
to  each  other  in  that  the  splitting  pattern  was  usually  different  in  each  dimension, 
allowing  more  accurate  picking  of  the  center  of  peaks  and  the  deconvolution  of 
overlapping  peaks. 
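
The peak count quoted above can be reproduced with a short sketch that follows the bookkeeping in the text. The function name and the way the missing Y1 peak is handled (dropping both that residue and the N-terminal allowance) are assumptions made to match the stated totals of 45 and 43.

```python
def expected_fingerprint_peaks(n_residues, n_gly, n_pro, n_terminal_extra=1):
    """Upper bound on NH-CaH COSY fingerprint peaks, using the text's bookkeeping.

    Each residue contributes one peak, each glycine one extra (two CaH protons),
    each proline none (no amide proton), plus an allowance for the N-terminus.
    """
    return n_residues + n_gly - n_pro + n_terminal_extra


# Composition quoted in the text: 42 residues, 5 glycines, 3 prolines.
ideal = expected_fingerprint_peaks(42, 5, 3)

# With no Y1 peak observed, drop that residue and the N-terminal allowance.
realistic = expected_fingerprint_peaks(41, 5, 3, n_terminal_extra=0)

observed = 57
print(ideal, realistic, observed - realistic)
```

On this count, at least 14 of the 57 observed peaks cannot belong to the major form of the peptide.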

The  TOCSY  spectra  were  relatively  similar  despite  using  a  range  of 
mixing  times  from  20  to  100  ms.  The  best  overall  spectrum,  at  a  mixing  time  of 
60  ms,  was  used  for  brothering  the  side  chain  connectivities.  The  spectra  of 
shorter mixing times lacked some peaks, while the 100 ms TOCSY had many
peaks which were highly misshapen and overlapped. The 100 ms spectrum did
contain a few more peaks than the 60 ms, and these were added to the spin
system assignments. The 80 ms TOCSY was very similar to the 60 ms TOCSY
and probably could have been used just as effectively for making the spin
system  assignments.  All  of  the  TOCSY  spectra  suffered  to  some  extent  from 
misshapen  peaks  which  were  likely  due  to  the  slightly  offset  overlapping  peaks 
resulting  from  the  heterogeneity  of  the  sample. 

The  200  ms  TNNOESY  spectrum  was  superior  to  the  other  NOESY 
spectra  and  aligned  properly  with  the  TNTOCSY  and  TNCOSY  spectra.  In 
contrast to the TOCSY spectra, where spectra from different samples at different
mixing  times  could  be  used  in  a  complementary  fashion,  the  various  mixing  time 
NOESY  spectra  from  different  samples  could  not  be  effectively  used  with  the  200 
ms  TNNOESY.  In  COSY  and  TOCSY  spectra,  when  peaks  shifted  slightly,  the 
peaks  could  be  easily  correlated  by  following  their  spin  systems.  In  the  NOESY 
spectra,  however,  there  were  many  unidentified  peaks  which  led  to  many 
ambiguous  connectivities.  Therefore,  to  follow  the  shifting  of  peaks  between 
NOESY  spectra  taken  under  slightly  different  conditions,  the  companion  shifts  in 
COSY  and  TOCSY  spectra  from  the  various  conditions  had  to  be  first  correlated. 
For  these  reasons,  the  200  ms  TNNOESY  was  used  as  the  primary  NOESY 
spectrum for making interresidue connectivities and deriving distance
constraints.  The  NOESY  spectra  also  suffered  from  asymmetric  peaks,  which 
were  likely  due  to  the  heterogeneous  nature  of  the  sample. 

The  presence  of  chemical  microheterogeneity  and  apparent  isomerization 
exhibited  itself  in  all  spectra,  both  as  excess  peaks  with  distinctly  separate 
chemical  shifts  and  as  peaks  of  lesser  intensity  in  close  proximity  or  partially 
overlapping  stronger  peaks.  Excess  peaks  and  shadow  peaks  could  result  from 
different  causes,  since  extra  peaks  could  result  from  peptides  of  slightly  different 
composition or conformation. If chemical microheterogeneity and isomerization
are both present in ca. 4:1 ratios, the spectra will consist of peaks from at least
three  different  peptide  forms.  The  least  populated  form,  the  minor  conformer  of 
the  minor  sequence,  would  exist  at  low  concentration  and  might  not  be  visible,  as 
would  other  minor  forms  which  may  be  present.  The  major  form,  the  primary 
conformer  of  myotoxin  a,  still  accounted  for  the  majority  of  peaks  and  could,  in 
most  cases,  be  unambiguously  assigned.  The  heterogeneity  of  the  sample  had 
its  most  troublesome  effect  on  interpretation  of  NOESY  spectra  where  weak 
peaks  could  not  be  distinguished  as  strong  NOEs  from  a  minor  form  or  weak 
NOEs  from  the  major  form. 
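
The relative populations implied by two 4:1 ratios can be estimated with a short calculation. The sketch assumes exact 4:1 ratios, that the two kinds of heterogeneity are independent, and that the minor sequence shows the same conformer ratio as the major one; none of these assumptions is established by the data.

```python
from itertools import product

# Assumed exact 4:1 ratios for both kinds of heterogeneity ("ca. 4:1" in the text).
sequence_fraction = {"major sequence": 0.8, "minor sequence": 0.2}
conformer_fraction = {"major conformer": 0.8, "minor conformer": 0.2}

# Assumes the two kinds of heterogeneity are independent of each other.
populations = {
    (seq, conf): fs * fc
    for (seq, fs), (conf, fc) in product(sequence_fraction.items(),
                                         conformer_fraction.items())
}

for form, fraction in sorted(populations.items(), key=lambda item: -item[1]):
    print(form, round(fraction, 2))
```

Under these assumptions the major form accounts for about 64% of the sample, two intermediate forms for about 16% each, and the least populated form for only about 4%, consistent with its peaks being too weak to observe.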

Spectra  obtained  from  different  samples  at  slightly  different  pH  and 
different  temperatures  proved  to  be  most  useful  in  sorting  out  ambiguities  in 
connectivities  and  in  finding  peaks  hidden  by  the  suppressed  water  resonance  in 
the 25°C spectra. Recording spectra at 25°C at slightly different pH caused the
NH peaks to shift to some degree while most of the CαH peaks did not move
significantly. In many cases, different sets of peaks aligned in different spectra.
Since a connectivity must align in all spectra, the deconvolution of ambiguities
was occasionally reduced to a process of elimination. The 50°C spectra proved
invaluable in finding the NH-CαH peaks of D29 and K35.

Assignment  of  Spin  Systems.  The  first  spin  systems  identified  were  the 
glycines, which each gave 2 NH-CαH peaks in the COSY. The glycine peaks
were more highly split than other peaks and displayed connectivity to CαH-CαH
peaks, which are unique to glycine residues. The remaining spin systems were
identified by the complete side chain connectivities in the TOCSY (see Figure 6)
and the complementary pattern in the COSY.

The two isoleucine residues were initially tentatively identified by their
unique COSY pattern, though not all such peaks were resolved (see Figure 7).
No other residue with a single CβH occurs in myotoxin a. To preclude
misassignment due to coincidental CβH methylene shifts, both isoleucines'
identities were also corroborated by their sequential connectivities.

Although  leucine  provides  a  unique  COSY  pattern,  L25  was  only  initially 
identifiable  as  a  long  chain  residue.  The  subsequent  connectivity  resolved  the 
identity  of  this  residue.  The  10  lysine  residues  were  identified  as  long  chain 
residues  and  later  confirmed  as  lysines  by  their  connectivities.  The  overlap  of 
peaks in the far upfield region (0.5-3.5 ppm) of the COSY made it impossible to
completely follow the unique pattern of the lysine side chain all the way through
their spin systems. The three other residues Wüthrich (1986) identifies as
having  unique  spin  systems  are  alanine,  valine,  and  threonine;  none  of  these  are 
present  in  myotoxin  a. 

AMX spin systems were identified by their NH-CαH-2CβH pattern in the
COSY and the lack of further side chain connectivities in the TOCSY. This
technique would not preclude misassignment of long chain residues with side
chain peaks beyond CβH neither visible nor resolved in the spectra. The AMX
spin system assignments represented 1 tyrosine, 6 cysteine, 2 histidine, 1
phenylalanine, 3 serine, and 2 aspartate residues. Lacking CγH protons, the
aromatic side chains could be connected back to the AMX spin systems by CβH-
ring proton TOCSY (H10, W32, and W34) or NOESY (F12) peaks with the same
CβH shifts as spin systems identified as AMX. This connectivity was lacking for
Y1 and H5.

Glutamate, glutamine, and methionine share an otherwise unique spin
system pattern. These 3 residues each appear once in myotoxin a. Q3 and
M28 were initially identified as long chain residues and later uniquely identified
by their connectivities. The spin system of E15 was identified as E/Q/M in the
COSY spectrum by the characteristic pattern (see Figure 8) and later specifically
identified by sequential connectivities.

FIGURE 7: I17 side chain connectivities in a DQFCOSY of myotoxin a, 3.4 mM,
pH 3.5 in 85% H2O/15% D2O at 25°C.

Arginine residues exhibit a NεH peak that lies in a sparse area of the
spectrum upfield of the aromatic ring proton peaks and downfield of CαH peaks.
R31 was uniquely identified by the NεH-CδH peaks which corresponded to CδH
peaks of the same shift in a long chain spin system.

Serine residues have CβH peaks that are shifted unusually far downfield
into the CαH region. S22, S23, and S41 were identified as serine residues by
being the only 3 AMX residues with CβH peaks further downfield than 3.5 ppm
(see  Figure  6).  Sequence  specific  assignments  were  made  by  their  sequential 
connectivities. 

A tryptophan residue is readily identified by a NεH peak that lies downfield
of 10 ppm and forms a COSY crosspeak with the 2h (Cδ1H) peak (nomenclature
per Wüthrich, 1986) at ca. 7.2-7.4 ppm. These crosspeaks appear in an
uncluttered region of the spectrum and lead into the aromatic region where the
NOESY NεH-7h crosspeak connects to the remainder of the ring proton peaks
which  can  be  identified  by  their  unique  COSY  connectivity  pattern.  All  the  ring 
proton  peaks  of  W32  and  W34  were  readily  identified  in  this  manner  (see  Figure 
9),  with  their  sequence  specific  assignments  made  by  connectivity  back  to  AMX 
spin  systems  and  their  interresidue  connectivities. 

The  spin  systems  of  histidine  residues  were  identified  by  three  2h-4h 
crosspeaks  (see  Figure  9).  One  crosspeak  was  weak  compared  to  the  other  two 
and  was  determined  to  belong  to  some  minor  form  of  peptide  present.  Of  the 
two strong peaks, only H10 could be connected back to an AMX spin system and
its sequential connectivities. H5 was, therefore, assigned by default.
Consequently, the assignment of AMX peaks for H5 is, admittedly, less sure.

The  aromatic  spin  system  of  F12  was  readily  identified  by  its  characteristic 
COSY  pattern  (see  Figure  10).  The  corresponding  AMX  spin  system  was  initially 
identified by sequential connectivities and later corroborated by a CβH-2,6h
NOESY  crosspeak. 

The  spin  systems  of  the  three  proline  residues  were  identified  after  all 
other  spin  systems  were  brothered.  Since  proline  lacks  an  amide  proton,  the 
characteristic CαH-CβH-CγH-CδH COSY connectivities, with CδH shifts lying
between CαH and CγH shifts, were identified in the upfield half of the spectrum
(see Figure 11) without corresponding connectivities to a NH peak. The specific
identification of P13 and P21 resulted from sequential CαH(i)-NH(i+1) NOESY
crosspeaks  to  K14  and  S22,  respectively.  P20  was  assigned  as  the  remaining 
proline  spin  system. 

As mentioned earlier, the fingerprint NH-CαH peaks for D29 and K35 were
visible in only the 50°C spectra (see Figure 12) because of their proximity to the
water resonance at 25°C.

Wüthrich (1986) predicted that the connectivities solely in the fingerprint
region of a NOESY spectrum would result in correct sequential assignments for
about one half of the residues. For this reason and to resolve ambiguities, a
search for sequential connectivities was carried out in the NH-NH and NH-CβH
regions  of  the  NOESY  spectrum.  Figure  13  shows  several  such  NH-NH 
connectivities. 

The spin system, aside from the ring, was not identified for Y1, due to the
highly exchangeable amino protons and also perhaps due to a highly flexible N-
terminus. The assignments of residues K2, Q3, H5, and K6 are less certain than
the rest of the assignments because they were determined solely by fingerprint
region connectivities and spin system (long chain and AMX). G42 was not
identified since connectivity could not be established from S41 to any of the
remaining 3 pairs of glycine peaks. The remaining assignments were
corroborated with additional connectivities or identification of unique spin
systems or both. The assignment of the fingerprint region of the COSY
spectrum appears in Figure 14. Many unassigned, excess, unknown peaks are
clearly evident. The sequential connectivities for K7-F12, P13-I19, and P21-S41
are shown in Figure 15. The resultant sequence specific 1H-NMR spin system
assignments are summarized in Table I.

FIGURE 10: Aromatic ring proton peaks of F12 and Y1 in a DQFCOSY of
myotoxin a, 3.4 mM, pH 3.5 in 85% H2O/15% D2O at 25°C.

FIGURE 11: P20 side chain connectivities in a DQFCOSY of myotoxin a,
3.4 mM, pH 3.5 in 85% H2O/15% D2O at 25°C.

FIGURE 12: Intraresidue and sequential
NOESY spectrum of myotoxin a, 3.7 mM,

FIGURE 13: NH-NH region of NOESY (200 ms) of myotoxin a, 3.4 mM, pH 3.5 in
85% H2O/15% D2O at 25°C. Sequential NH-NH connectivities are shown.

Distance  Geometry.  The  original  constraints  (Nieman,  unpub.)  contained 
a  few  highly  questionable  assignments.  These  few  constraints  were  removed, 
and  Dspace  structures  were  generated.  This  process  was  continued  with 
modifications  to  the  Dspace  refinement  macros  and  minor  changes  to  the 
constraints, such as constraining only one of a pair of diastereotopic protons when
only one such volume was measured. Eventually, the best structures from
Dspace were created with this file of 170 constraints (44 intraresidue, 74
interresidue sequential, and 52 interresidue nonsequential). Ten pairs of
structures were created, each structure taking ca. 15 hours to complete. The
four most well refined structures superimposed on the best structure gave all
atom root mean square distances (RMSDs) of 4.71, 4.52, and 5.02 Å. The
backbones  of  these  superimposed  structures  are  visualized  in  Figure  16. 

Since an alternate disulfide bond arrangement had been published for the
highly homologous myotoxin, crotamine (Conti & Laure, 1988), structures were
created in Dspace with the same experimental constraints but without disulfide
bonds. Arbitrary disulfide bonds were selected from the cysteine side chains in
closest proximity in the resultant best structure. More Dspace structures were
generated with the same experimental constraints but with the new disulfide
arrangement of 4-18, 11-36, and 30-37. Surprisingly, these structures were both
better refined and had lower RMSDs. The best four superimposed on the best
with all atom RMSDs of 3.55, 3.34, and 3.25 Å. This result led to a
reinvestigation of the disulfide bond arrangement (Bieber et al., unpub.) which
confirmed the original structure of Fox et al. (1979). None of the Dspace
structures from this set of original constraints, whether with the actual or the
arbitrary disulfide arrangement, showed much secondary structure. They looked
like three loops constrained by these disulfide bridges.

FIGURE 15: Fingerprint region (NH-CαH) of NOESY of myotoxin a, 3.4 mM, pH
3.5 in 85% H2O/15% D2O at 25°C. Intraresidue NH-CαH peaks are labeled;
sequential interresidue NH-CαH peaks are not labeled. Residues whose labels
appear in parentheses are visible in the 50°C spectra.

Table I: NMR Chemical Shifts of Myotoxin a at pH 3.5, 25°C (*50°C) (ppm)

residue   NH     CαH    CβH    other (and unassigned)

Tyr-1                          7.05 (2,6h), 6.80 (3,5h)

Lys-2     7.67   4.10   1.63   1.15

FIGURE 16: Superimposition of four of the best Dspace structures (backbones
only) of myotoxin a based on the original distance constraints (see text).

A  possible  reason  for  the  lack  of  converged  structures  could  have  been 
incorrect sequential assignments. The sequential assignments described in the
previous  section  were  independently  made  with  new  spectra  and  compared  with 
the  assignments  of  previous  investigators  (Henderson,  1986;  Murchison,  1989; 
Nieman,  unpub.).  The  results  were  in  near  complete  agreement  with  previous 
results;  the  major  exception  was  the  uncertainty  in  the  assignment  of  the  N- 
terminal  residues  mentioned  in  the  previous  section. 

DIANA  proved  to  be  a  much  more  useful  distance  geometry  program 
because  it  ran  much  faster,  filtered  out  meaningless  constraints,  and  provided 
better  feedback  which  allowed  one  to  locate  potentially  erroneous  constraints. 
The  original  constraints,  previously  used  with  Dspace,  were  put  in  DIANA  format 
and  run  as  upper  limits.  DIANA  put  out  a  modified  version  of  this  upper  limit  file 
after  it  had  deleted  irrelevant  constraints,  lengthened  constraints  that  were 
shorter than those allowed by the covalent geometry, and lengthened constraints to
accommodate the use of pseudoatoms for non-stereospecifically assigned
prochiral protons. The resulting file consisted of 116 experimental constraints
(eight intraresidue, 59 interresidue sequential, and 49 interresidue
nonsequential). Running on the Cray, 10 structures were created in ca. 11 min,
consuming 2.33 min of CPU time. The lowest target function value was 69.65,
and the best pair of superimposed backbones had a RMSD of 4.10 Å.

Several  iterations  of  removing  consistently  violated  constraints  and 
building  structures  led  to  a  final  version  of  the  original  constraints,  containing  88 
experimental  upper  limits  (8  intraresidue,  52  interresidue  sequential,  and  28 
interresidue  nonsequential)  (see  Appendix  C).  None  of  these  experimental 
constraints involved Y1, K2, Q3, H5, K6, or K7. From these constraints, 536
structures were created, each taking an average of just under 3 min clock time or
1.06 min CPU time, on the Indigo. The best structure had a target function of
0.71, or almost 2 orders of magnitude better than the first version from original
constraints. This result seemed to indicate that many erroneous constraints had
been weeded out. The 4 best structures superimposed on the best structure
with backbone RMSDs of 2.25, 2.80, and 2.90 Å.

Using the confirmed assignments, 200 ms NOESY peaks were integrated
and converted to distance constraints in Felix. These constraints were manually
converted to a DIANA upper limit file of 134 experimental constraints (77
intraresidue, 29 interresidue sequential, and 28 interresidue nonsequential).
Several additional interresidue nonsequential constraints would have been
obtained if NHK7, 2hW32, and 4hF12 had not had so many ambiguous
crosspeaks. This set of new constraints shared only 36 constraints (6
intraresidue, 14 interresidue sequential, and 16 interresidue nonsequential) with
the original set, all of which were modified. When run with the new constraints,
DIANA wrote out a modified version that deleted 40 meaningless constraints and
modified 49 for non-stereospecific assignments. The resultant set of new
constraints consisted of 86 experimental upper limits (31 intraresidue, 24
interresidue sequential, and 31 interresidue nonsequential) (see Appendix C).
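
The category tallies quoted for each constraint set can be checked with a few lines of arithmetic. The sketch below is illustrative only; the dictionary labels are made up for this example, and the choice of the 86-constraint filtered set as the denominator for the shared fraction is an assumption (it reproduces the 42% overlap quoted for the two sets).

```python
# Constraint tallies quoted in the text, as
# (intraresidue, interresidue sequential, interresidue nonsequential).
constraint_sets = {
    "original, first DIANA run": (8, 59, 49),
    "original, final version": (8, 52, 28),
    "new, as converted from Felix": (77, 29, 28),
    "new, after DIANA filtering": (31, 24, 31),
    "shared by both sets": (6, 14, 16),
}

# Total number of experimental constraints in each set.
totals = {name: sum(counts) for name, counts in constraint_sets.items()}
for name, total in totals.items():
    print(f"{name}: {total}")

# Assumption: the overlap is expressed relative to the filtered new set.
shared_fraction = totals["shared by both sets"] / totals["new, after DIANA filtering"]
print(f"shared fraction: {shared_fraction:.0%}")
```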

These new constraints were run in DIANA to produce 999 structures. The
lowest target function was 0.49. Of the 4 best structures, the other three
superimposed on the best structure with backbone RMSDs of 2.52, 2.30, and 2.41 Å
(all non-H atom RMSDs of 4.10, 3.79, and 3.87 Å). When the best structure from
the original constraints and the second and third best structures from the new
constraints are superimposed, they show a consistent backbone folding with
RMSDs of ca. 2.5 Å (see Figure 17a), in spite of sharing only 42% of the same
constraints. Figure 17b shows the best structure from the original constraints
and the best and third best structures from the new constraints superimposed. In
both cases, a central core of three strands of antiparallel β-sheet (see Figure
18) constrained by 3 disulfide bridges is joined by loops and turns, including a
modified type VI turn (Creighton, 1993) in residues C18, I19, P20, and P21 (see
Figure 19). This unusual turn was identified by looking for and finding a strong
NOESY crosspeak between CαH of I19 and CαH of P20, uniquely indicative of a
cis-proline peptide bond (see Rawn, 1989, for stereoviews of cis and trans
proline peptide bonds). Neither of the other two prolines showed such a peak. It
is possible that a cis-trans isomerization about this bond could be the source
of the two conformations observed as interconverting peaks on the HPLC.

Low Temperature HPLC Separations. Separation of 1.0 mg of myotoxin a
(10 mg/ml in water) by semipreparative RP-HPLC at 25°C on a 20-21%
acetonitrile gradient yielded two large peaks (A and B) in a 4:1 ratio of B to A,
B/(A + B) = 80.4% (see Figure 20a). After the B fraction from this run was
combined with that from two similar runs (77.3% and 83.1% B), dried,
redissolved, and injected, separation yielded 79.4% B (Figure 20b). 500 µl of
this B fraction was injected 71 min later than the previous run. This separation
yielded 83.4% B (Figure 20c), indicating that the B fraction equilibrates back
into the A and B fractions within about an hour at 25°C.

FIGURE 17: Stereoviews (a and b) of three superimposed backbones of distance
geometry generated structures of myotoxin a. See text for details.

FIGURE 18: Schematic representation of antiparallel β-sheet in myotoxin a with
arrows pointing to proton pairs which give rise to NOE peaks.

When chilled solvents and sample were used, separation of 1.0 mg of
myotoxin a (10 mg/ml in water) at ca. 2°C on a 24-27% acetonitrile gradient
yielded 70.7% B (Figure 20d). When this B fraction was held on ice and 500 µl
of it was injected 206 min later than the previous run, the separation yielded
95.7% B (Figure 20e). This result indicates that at 2°C the rate of reestablishing
equilibrium is significantly reduced, maintaining an enrichment of over 95% B for
more than 3.4 hours.
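
The slowed interconversion can be put on a rough quantitative footing with a two-state, first-order relaxation model, f(t) = f_eq + (f0 - f_eq) exp(-kt). The following is a minimal sketch, not an analysis performed in this work: it assumes that the isolated B fraction starts essentially pure (f0 = 1.0) and that the equilibrium fraction of B at 2°C is near the roughly 80% observed at 25°C. Neither assumption was measured, and the function names are hypothetical.

```python
import math


def rate_constant(f0, f_t, f_eq, t_min):
    """First-order relaxation rate (per minute) toward equilibrium."""
    return -math.log((f_t - f_eq) / (f0 - f_eq)) / t_min


def fraction_b(f0, f_eq, k, t_min):
    """Fraction of form B after t_min minutes of relaxation."""
    return f_eq + (f0 - f_eq) * math.exp(-k * t_min)


# Observed in the text: 95.7% B when reinjected 206 min after isolation at 2 C.
# Assumed (not measured): pure B at t = 0 and 80% B at equilibrium.
k_cold = rate_constant(f0=1.0, f_t=0.957, f_eq=0.80, t_min=206.0)
half_life_min = math.log(2) / k_cold

print(f"k (2 C, assumed model): {k_cold:.5f} per min")
print(f"relaxation half-life:   {half_life_min:.0f} min")
```

Under these assumptions the relaxation half-life at 2°C comes out longer than the 206 min holding time, consistent with the enrichment persisting for hours. The estimate is only as good as the assumed starting and equilibrium fractions.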

GCG Analyses. Peptide sequence homology searching using FastA in
the SwissProt database revealed five proteins with >80% homology. These were
all rattlesnake myotoxins. A gap existed in the homology scoring from >80 to 60,
confirming the uniqueness of this family (Bieber et al., 1987). The other
homologies were over small regions, usually either in the first or second half of
myotoxin a's sequence but rarely in the middle (see Appendix D). Of functional
interest, a 60% homology over 10 residues, including 3 cysteines, with additional
conservative substitutions exists with rat brain sodium channel protein II (see
Figure 21). A TFastA search of the GenEMBL database revealed that
transcriptions of high homology or obvious functional relevance to myotoxin a
were not present. A search of Motifs identified myotoxins and a cAMP/cGMP-
dependent protein kinase phosphorylation site.

FIGURE 21: Region of homology between myotoxin a and rat brain sodium
channel protein II. Dashed lines indicate conservative substitution per FastA.


PeptideStructure provided a Chou-Fasman prediction of α-helix (weak
helical formers) for residues 2-7 and 12-19; turns at 9-10, 21-24, and 38-41; and
β-sheet (weak β-sheet formers) at 31-37. This prediction is in contrast with the
Garnier-Osguthorpe-Robson prediction of nearly all turns except for a helical
stretch from 28-35 (see Figure 22a). HelicalWheel was run on the
possible helical region 1-12, taking into account the predictions, locations of
prolines, and the distance geometry structures. Figure 22b shows that if this
region were an α-helix, a hydrophobic "greasy" patch would exist on one side of
the helix while many charged residues would be on the opposite side.

FIGURE 22: GCG PlotStructure secondary structure predictions for myotoxin a
(a, top) and HelicalWheel (b, bottom) for residues 1-12.
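
A helical wheel places successive residues 100° apart around the helix axis (3.6 residues per turn in an ideal α-helix). The sketch below shows that projection for residues 1-12, using the residue identities that appear in the text and in the Appendix C constraint lists. It is a minimal illustration and not the GCG HelicalWheel program; the function name and the choice of 0° for residue 1 are assumptions.

```python
# Residues 1-12 of myotoxin a, as named in the text and constraint lists.
SEQ_1_12 = "YKQCHKKGGHCF"
DEGREES_PER_RESIDUE = 100.0  # ideal alpha-helix: 360 degrees / 3.6 residues


def wheel_angles(sequence, first_residue=1):
    """Return (residue number, one-letter code, angle in degrees) tuples."""
    return [
        (first_residue + i, aa, (i * DEGREES_PER_RESIDUE) % 360.0)
        for i, aa in enumerate(sequence)
    ]


for number, aa, angle in wheel_angles(SEQ_1_12):
    print(f"{aa}{number:<3d} {angle:5.0f} deg")
```

Residues whose angles fall close together lie on the same face of the hypothetical helix, which is how a hydrophobic patch opposite a charged face would show up.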


CHAPTER  IV 
CONCLUSIONS 
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Structural  Features.  The  distance  geometry  generated  structure  of 
myotoxin a is not highly defined. The best-refined structures superimpose
with backbone RMSDs of ca. 2.5 Å, a rather coarse "resolution," especially for
such a small protein. Nevertheless, the fact that these structures were
generated from two sets of distance constraints which were derived from
different spectra, separately assigned, and only 42% alike leads to a high degree
of confidence in the accuracy of the global folding. The confirmed
sequence-specific assignments likewise seem fairly certain.

The  lack  of  tighter  convergence  of  structures  comes  from  two  sources. 
First,  the  distance  constraints  used  are  a  first  approximation  from  NOESY  data  at 
a  single  mixing  time.  More  accurate  first  approximations  could  be  made  by 
assigning  a  set  of  DQCOSY,  TOCSY,  and  NOESY  spectra  taken  on  the  same 
sample  under  the  same  conditions,  with  several  mixing  times  for  NOESY 
spectra.  Using  the  tools  of  Felix,  the  actual  integration  and  subsequent 
derivation  of  distance  constraints  from  NOESY  spectra  that  differ  only  in  mixing 
times  is  a  straightforward  procedure. 

Even with multiple mixing times, an inherent inaccuracy exists in deriving
distance constraints using this isolated spin pair approximation, in which the
intensity of a NOE peak is assumed to be inversely proportional to the sixth power
of the distance between two interacting protons. This approximation fails to
account for spin diffusion, the transfer of magnetization to other nearby
protons. This effect can significantly alter the distance constraints derived from
an isolated spin pair approximation for mixing times as short as 50 ms (Meadows
et al., 1991).
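
Under the isolated spin pair approximation, an unknown distance follows from a reference peak of known distance as r = r_ref (I_ref / I)^(1/6). The sketch below illustrates that relation only. It is not the Felix procedure used in this work, and the function name and the numerical values in the example are placeholders chosen for illustration.

```python
def noe_distance(intensity, ref_intensity, ref_distance):
    """Distance from a NOE peak volume, assuming intensity ~ r**-6.

    ref_intensity and ref_distance describe a calibration peak between
    two protons whose separation is fixed by covalent geometry.
    """
    if intensity <= 0 or ref_intensity <= 0:
        raise ValueError("peak volumes must be positive")
    return ref_distance * (ref_intensity / intensity) ** (1.0 / 6.0)


# Placeholder numbers: a peak 64 times weaker than the reference peak
# corresponds to twice the reference distance, because 64 ** (1/6) == 2.
print(noe_distance(intensity=1.0, ref_intensity=64.0, ref_distance=2.0))
```

The sixth-root dependence means that large errors in peak volume translate into small errors in distance, but it also means that intensity gained or lost through spin diffusion is silently folded into the derived distance.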

One  readily  available  way  to  refine  the  distance  constraints  to  a  greater 
level of accuracy is through the use of BKCALC (Hare Research) (Nerdal et al.,
1989), a module within Dspace. BKCALC takes a structure generated from a set
of  first  approximation  distance  constraints  and  back  calculates  its  NOESY 
spectrum,  taking  into  account  spin  diffusion.  The  calculated  spectrum  is 
compared  with  the  experimental  spectrum,  and  distance  constraints  are  adjusted 
accordingly.  More  structures  are  generated  from  these  refined  constraints,  and 
the  process  is  repeated  in  an  iterative  manner  until  the  calculated  spectrum 
matches  the  experimental  spectrum. 

The  second  source  of  limited  convergence  of  structures  is  the  relatively 
few  nonsequential  distance  constraints,  those  which  are  most  important  in 
defining  the  overall  folding  of  the  protein  (Wuthrich,  1986).  It  simply  would  not 
be  possible  to  highly  define  the  positions  of  all  side  chains  in  a  protein  this  size 
with  86  total  constraints,  31  of  which  are  nonsequential.  As  mentioned  earlier,  a 
few  ambiguities  prevented  the  complete  assignment  and,  hence,  use  of  several 
partially  assigned  NOE  peaks  and  their  resultant  distance  constraints.  As  is 
apparent  at  this  point,  the  excess  peaks  introduced  by  the  chemical 
microheterogeneity  and  isomerization  have  their  most  detrimental  effect  as 
distracters causing ambiguities in the NOESY spectra. Only distance constraints
from unambiguously assigned peaks should be used, as erroneous
constraints would be highly counterproductive to distance geometry calculations
and waste much time.

Chemical  Microheterogeneity.  The  slight  differences  between  the  forms 
of C. v. viridis myotoxins make it difficult to separate them by conventional
preparative  biochemical  techniques,  such  as  ion  exchange  chromatography,  RP- 
HPLC,  gel  filtration,  or  precipitation  techniques.  Clearly,  the  greatest  hope  for 
purification  of  myotoxin  a  from  the  other  viridis  forms  lies  in  affinity 
chromatography.  A  synthesized  C-terminal  fragment  corresponding  to  viridis-3 
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could  be  attached  to  a  carrier  protein  and  used  for  the  production  of  polyclonal 
antibodies  specific  to  viridis-2,  3,  and  4  but  not  to  myotoxin  a.  It  is  not  certain 
whether  a  fragment  small  enough  to  elicit  such  a  specific  response  would  elicit 
much of a response in the first place. Alternatively, the entire myotoxin a
sequence could be synthesized, but the likelihood of properly joining the
cysteines and folding the protein is also uncertain.

Isomers.  The  presence  of  two  isomers  (as  observed  by  HPLC)  which 
interconvert  slowly  on  the  NMR  timescale  leads  to  a  spectrum  that  is  the 
population  weighted  sum  of  the  resonances  from  the  two  isoforms,  rather  than 
their  average.  The  greatly  reduced  rate  of  interconversion  at  2°C  offers  hope  of 
keeping  a  sample  highly  enriched  in  one  form  long  enough  to  acquire  NMR 
spectra  (1D  or,  preferably,  2D).  The  exploration  of  solvent  systems  which  might 
greatly  shift  the  equilibrium  to  one  conformer  or  the  other  seems  worthy  of 
consideration  as  well. 

The observation of a cis-proline peptide bond leads to consideration of
cis-trans isomerization as a prime suspect for producing the two observed
conformers. The use of peptidyl proline isomerase (PPI) (Harrison & Stein,
1990) in kinetic experiments that employ HPLC and 1D ¹H-NMR methods (Hsu
et al., 1990) to measure the rates of interconversion would help test this
hypothesis. An understanding of the interconversion mechanism, however, does
not seem directly relevant to solving the structure.

It is not possible to determine the extent to which chemical
microheterogeneity and isomerization each affect the spectra. Therefore, it is
not possible to predict how much the ¹H-NMR spectra would improve by
producing a chemically homogeneous sample or by maintaining a sample
enriched in one conformer. Certainly, combining both approaches should lead to
¹H-NMR spectra dominated by a single species and allow a fairly quick analysis
of the spectra and solution of a highly defined structure. The pursuit of both
approaches would be very time and resource consuming. It is possible that a
single approach may remove many plaguing peaks from ¹H-NMR spectra,
leading to more complete assignments and enough distance constraints to build
well converged and refined structures. The most pragmatic approach would be
to try the easiest strategy first: low temperature isomer enrichment by RP-
HPLC. The use of three-dimensional ¹H-NMR experiments may also prove to be
a relatively easy way to resolve spectral ambiguities.

Structure-Function. The combination of solved structural features and
secondary structure predictive methods along with the pH titration work of
Henderson (1986) leaves open the possibility of an N-terminal helix that is not
stable at low pH. Such a helix would be amphipathic.

While the rat brain sodium channel protein II has not been detected in rat
skeletal muscles (Gordon et al., 1987), the region of homology with myotoxin a is
notable because it includes the conserved cysteines so critical to myotoxin a's
structure. Though this region is only 10 residues long, it encompasses 24% of
myotoxin a's sequence. Future homology searches in this area seem well
warranted.

While it is purely speculative to define 2 domains in myotoxin a, bipolar
regions of homology, Ca++ inhibition, and perhaps structure suggest this as a
viable possibility. In such a model, one might envision myotoxin a with an N-
terminal α-helix that anchors to the membrane surface while a region near the C-
terminus, with charged side chains pointing away from aromatic rings along a
portion of antiparallel β-sheet, interacts with a sodium channel protein to effect
an uncontrolled influx of Na+ into the muscle cell.


Much additional investigation is needed to determine how myotoxin a
functions and how that function relates to its structure. A highly defined tertiary
structure, however, should be solvable within the near future through improved
purification and ¹H-NMR techniques.
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APPENDIX  A 
DSPACE MACROS


zipref.mac

c**zipref.mac

; this macro will sequentially zip through your structure
; to fix the chirality, minimize and anneal in 4-D (if
; necessary) an increasing number of residues until the
; entire structure is treated as one

get/sym nstr "Number of structures to create: "
get/sym refseq "Sequence to use: "
get/sym refbmx "Bounds matrix to use: "
get/sym fname "Filename for written structures: "

set/log on %fname%

rea/seq %refseq%
rea/bmx %refbmx%

; set pertinent variables different from defaults

set n_cy 16                ; reasonable
set n_dim 4                ; enables 4th dimension
set 4d 1                   ; enables 4D refinement

for refrun 1 %nstr%        ; main refinement run loop
embed                      ; embed in 4D space
wri %fname%tmp             ; save orig embed structure (4D)
fix/mirror                 ; refine mirror image of embed first
.zipfix                    ; fixes indiv res ca chirality w/ fix/mirror
.newc_prochiral            ; swaps around most bad chiral centers
fix/c                      ; fixes any other bad chiral centers
wri %fname%%refrun%apr     ; save pre-refinement mirror structure (4D)
call zipit
wri %fname%%refrun%a       ; save refined mirror structure (3D)
cle/x
rea %fname%tmp             ; get orig embed structure (3D)
.zipfix
.newc_prochiral
fix/c
wri %fname%%refrun%bpr     ; save pre-refinement structure (4D)
call zipit
wri %fname%%refrun%b       ; save refined structure (3D)
cle/x
rea/seq %refseq%           ; need sequence for next embed
endfor
end

;
zipit:
for zipbig 1 %n_res%
.cgrna %zipbig% (.2+.05*(%zipbig%-1))    ; sequential refine
endfor
set n_dim 3                ; goal is good 3D structure
for rezip 1 10             ; 10 final refinements or chiral fixes
min                        ; final refinements if chirality good
; fix/c > cprob            ; echoes any chirality problems fixed
; if %cprob% .sne. then    ; if chirality probs were found
; set n_dim 4              ; reset to 4D
; set 4d 1
; .cgrna 1 .3              ; zip refine individual residues
; .cgrna %nres% (.3+.05*(%n_res%-1))     ; refine whole mol
; set n_dim 3              ; return to 3D
; endif
endfor
return

zipfix.mac

c**zipfix.mac

; this macro looks at each residue, in order,
; calculates its signed volume for the ca chiral
; center, and if it is a negative value (D) changes
; it to positive (L) by using fix/mirror on only
; that residue

type "Performing ZipFix..."
type " "

for hres 1 %n_res%
view/residue [%hres%
/ f$residue{1} > cares
if %cares% .sne. [gly then
; .signvol ha n cb c
set/sym signvol 0
set/sym lha f$find{ha}
set/sym ln f$find{n}
set/sym lcb f$find{cb}
set/sym lc f$find{c}
/ v$get{%lha%,41}
/ v$get{%ln%,42}
/ v$get{%lcb%,43}
/ v$get{%lc%,44}
/ v$sub{44,41,45} v$sub{44,42,46} v$sub{44,43,47}
/ v$cross{46,47,48} v$dot{45,48} > signvol
ty " "
ty "signed volume of %cares% %hres% is " %signvol%
ty " "
if %signvol% .lt. 0 then
fix/mirror
type "Fix/mirror performed on " %cares% %hres%
type " "
endif
endif
endfor
view/all
dra
end
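
The signed-volume test that zipfix.mac applies to each Cα center can be illustrated outside Dspace. The following sketch computes a scalar triple product from the HA, N, CB, and C positions and shows that mirroring the coordinates flips its sign. It is an illustration under stated assumptions, not a reimplementation of the macro: the coordinates are hypothetical, and the overall sign convention depends on the operand order of the Dspace `v$sub` call, which is assumed here.

```python
def sub(a, b):
    """Component-wise difference a - b of two 3-vectors."""
    return tuple(x - y for x, y in zip(a, b))


def cross(a, b):
    """Cross product of two 3-vectors."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def signed_volume(ha, n, cb, c):
    """Triple product (c-ha) . ((c-n) x (c-cb)); the sign encodes handedness."""
    return dot(sub(c, ha), cross(sub(c, n), sub(c, cb)))


def mirror(point):
    """Reflect a point through the yz-plane."""
    return (-point[0], point[1], point[2])


# Hypothetical coordinates, chosen only so the arithmetic is easy to follow.
ha, n, cb, c = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0), (0.0, 0.0, 0.0)
volume = signed_volume(ha, n, cb, c)
mirrored = signed_volume(*(mirror(p) for p in (ha, n, cb, c)))
print(volume, mirrored)
```

Because reflection reverses the sign of the triple product, a residue whose volume has the wrong sign can be corrected by mirroring that residue alone, which is what the macro does with fix/mirror.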

cgrna.mac

c**cgrna.mac

get/sym nres "Number of sequential residues per min: "
get/sym pval "Acceptable total penalty value: "
for res 1 (%n_res%-%nres%+1)
;
if %nres% .eq. 1 then
set/sym tres %nres%
call ann
endif
;
begin:
set/sym ptry 0
call min
for %ptry% 1 20
if %p_tot% .gt. %pval% then
call ann
call min
endif
endfor
if %p_tot% .gt. %pval% then
call bombout
endif
goto drwnfix

min:
/ (%res%+%nres%-1) > tres
type " "
type "Minimizing residues %res% to %tres%"
fix/chiral
if %nres% .lt. 23 then
for bitmin 1 (%nres%+3)
min *[%res%:%tres%] 8.0 !{*[%res%:%tres%]}
endfor
else
for settmin 1 25
min *[%res%:%tres%] 3.0 !{*[%res%:%tres%]}
endfor
endif
return

ann:
type " "
type "Penalty value exceeded -- annealing..."
set wt_4d 0.1
set shake 0
for 4dhot 1 5
anneal *[%res%:%tres%]
endfor
for 4dcool 1 5
set wt_4d (%wt_4d%+.18)
anneal *[%res%:%tres%]
endfor
return

bombout:
type " "
type "Could not achieve acceptable penalty value for %res%:%tres%,"
type %p_tot%
type " "
return

drwnfix:
dr
fix/chiral
;
endfor
end


c**wholeanne.mac
set/sym trhs %n_res%
call anne
call mini
end
;
anne:
type " "
type "Performing whole molecule annealing"
set wt_4d 0.1
set shake 0
for 4dwarm 1 20
anneal *[1:%trhs%]
endfor
for ghostmin 1 15
min
endfor
for 4dchau 1 10
anneal *[1:%trhs%]
endfor
for 4dcold 1 5
set wt_4d (%wt_4d%+.18)
anneal *[1:%trhs%]
endfor
return
;
mini:
type " "
type "...and whole molecule minimization,"
fix/chiral
for pullmein 1 10
min
endfor
return

min50.mac

c**more50.mac
; performs min 50 times
for tightenup 1 50
min
endfor
end
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koaeonvrt  convert  DIANA  .cor  coordinate  output  files  to  Quanta  konnert 
input  format 

cat $1 | sed -e '/Q/ d
/DIANA/ d
/atoms/ d
s/xxx/yyy/g' > ee1

awk '{printf(" %2d %4s %-4s %8s %8s %8s%s",$3,$4,$2,$5,$6,$7,ORS)}' <ee1 >ee2

cat ee2 | sed -e 's/ARG HH1/ARG HH11/g
/ASP HD2/ d
s/ARG+/ARG /g
s/LYS+/LYS /g
s/ASP-/ASP /g
s/HIS+/HIS /g
s/CYSS/CYS /g
s/HB2/HB1/g
s/HB3/HB2/g
s/HD2/HD1/g
s/HD3/HD2/g
s/HG2/HG1/g
s/HG3/HG2/g
s/HE2/HE1/g
s/HE3/HE2/g
s/HN/H /g
s/xxx/yyy/g' > ee3

awk '{printf(" %3s %2d%-4s %8s0 %8s0 %8s0 0.00000%s",$2,$1,$3,$4,$5,$6,ORS)}' <ee3 >$2

aealon. filter  example  of  file  that  filters  Felix  2.05  written 

entities  replacing  arbitrary  numerical  assignments  with  sequence 
specific  assignments 

cat $1 | sed -e 's/ca63/caK2/g
s/cb63/cbK2/g
s/nh63/nhK2/g
s/cdPIII/cdP13/g
s/cgPIII/cgP13/g
s/ca45/caK14/g
s/cb45/cbK14/g
s/nh64/nhK16/g
s/c?64/c?K16/g
s/ca41/caI17/g
s/c?36/c?I19/g
s/caPII/caP21/g
s/caPI/caP20/g
s/nh28/nhC30/g
s/nh29/nhK39/g
s/c?32/c?S41/g
s/2hWII/2hW32/g
s/7hWb/7hW34/g
s/2hWI/2hW34/g
s/neWI/neW34/g
s/cbWI/cbW34/g' > taw

awk  '{printfC"  %3s  %8s  %6s  %ls  %  6s  %8s 

%6s  %5s%s",$l,$2,$3,$4,$5,$6,$7,$8,$9,$10,ORS) ) '  <taw  >$2 
rm  taw 


%6s 


%s 
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M""!  last  version  of  original  constraints 

  4 CYSS CB    36 CYSS SG     3.10
         SG    36 CYSS CB     3.10
         SG    36 CYSS SG     2.10
  8 GLY  HN     9 GLY  HN     3.50
  9 GLY  HN    10 HIS+ HN     5.00
         HN    38 LYS+ HA     5.00
         QA    38 LYS+ HA     4.38
 10 HIS+ HN    10 HIS+ QB     3.76
         HN    11 CYSS HN     5.00
         HN    12 PHE  HE2    5.00
         HN    36 CYSS HA     5.00
         HN    37 CYSS HN     4.00
         HN    38 LYS+ HA     5.00
         HN    39 LYS+ HA     5.00
         HA    11 CYSS HN     3.50
         QB    11 CYSS HN     3.68
         QB    12 PHE  HE2    4.88
 11 CYSS HN    11 CYSS QB     3.76
         HN    12 PHE  HN     5.00
         HA    12 PHE  HN     2.80
         HA    37 CYSS HN     5.00
         CB    30 CYSS SG     3.10
         QB    12 PHE  HN     4.38
         QB    13 PRO  QD     6.75
         SG    30 CYSS CB     3.10
         SG    30 CYSS SG     2.10
 12 PHE  HN    34 TRP  HA     6.00
         HN    35 LYS+ HN     3.50
         HN    36 CYSS HA     5.00
         HN    37 CYSS HN     5.00
         HE2   37 CYSS HN     5.00
 13 PRO  HA    34 TRP  HA     5.00
 15 GLU  HN    16 LYS+ HN     5.00
         HA    16 LYS+ HN     2.80
 16 LYS+ HN    17 ILE  HN     5.00
         HA    17 ILE  HN     2.80
         QB    17 ILE  HN     5.88
 17 ILE  HN    18 CYSS HN     5.00
         HA    18 CYSS HN     2.80
 18 CYSS HN    19 ILE  HN     5.00
         HA    19 ILE  HN     2.80
         CB    37 CYSS SG     3.10
         QB    19 ILE  HN     4.38
         QB    24 ASP  QB     5.75
         SG    37 CYSS CB     3.10
         SG    37 CYSS SG     2.10
 20 PRO  HA    21 PRO  QD     3.32
 21 PRO  HA    24 ASP  QB     5.88
 22 SER  HN    22 SER  QB     3.77
         HN    23 SER  HN     5.00
         HA    23 SER  HN     3.50
         QB    23 SER  HN     3.68
 23 SER  HN    24 ASP  HN     3.50
         HA    38 LYS+ HN     5.00
         QB    38 LYS+ HN     5.88
 24 ASP  HN    25 LEU  HN     5.00
         HA    25 LEU  HN     2.80
         HA    26 GLY  HN     5.00
         HA    37 CYSS HA     4.00
         HA    38 LYS+ HN     5.00
         QB    25 LEU  HN     3.68
         QB    37 CYSS QB     4.75
 25 LEU  HN    26 GLY  HN     3.50
         HN    26 GLY  QA     5.88
         HN    27 LYS+ HN     5.00
         HN    37 CYSS HA     4.00
         HA    26 GLY  HN     3.50
         QB    26 GLY  HN     5.88
 26 GLY  HN    27 LYS+ HN     6.00
         QA    27 LYS+ HN     3.13
 27 LYS+ HN    27 LYS+ QB     3.77
         HN    28 MET  HN     5.00
         HA    28 MET  HN     3.50
         HA    36 CYSS HN     5.00
         QB    28 MET  HN     3.68
 30 CYSS HN    30 CYSS QB     3.76
         HN    31 ARG+ HN     5.00
         HA    31 ARG+ HN     2.80
         QB    31 ARG+ HN     4.38
 31 ARG+ HA    32 TRP  HN     3.50
 32 TRP  HA    34 TRP  HN     5.00
 33 LYS+ HA    34 TRP  HN     3.50
         QB    34 TRP  HN     5.88
 34 TRP  HN    34 TRP  QB     3.76
         HN    35 LYS+ HN     5.00
         HA    35 LYS+ HN     3.50
 35 LYS+ HN    36 CYSS HN     6.00
 36 CYSS HN    36 CYSS QB     3.76
         HN    37 CYSS HN     5.00
         HA    37 CYSS HN     2.80
 37 CYSS HN    37 CYSS QB     3.76
         HN    38 LYS+ HN     5.00
         HA    38 LYS+ HN     2.80
         QB    38 LYS+ HN     4.38
 40 GLY  HN    41 SER  HN     5.00
         HN    41 SER  HA     6.00
 41 SER  HA    42 GLY  HN     3.50
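
The intraresidue, sequential, and nonsequential tallies quoted in the text can be reproduced from an upper limit list by comparing the two residue numbers in each row. The sketch below classifies a handful of rows taken from the list above. It is illustrative only; the function name is hypothetical, and treating rows that involve SG atoms as disulfide geometry rather than experimental constraints is an assumption, made because excluding them brings the row count into agreement with the totals reported in the text.

```python
from collections import Counter

# A few rows from the upper limit list:
# (residue i, name i, atom i, residue j, name j, atom j, upper limit in angstroms)
SAMPLE_ROWS = [
    (4, "CYSS", "SG", 36, "CYSS", "SG", 2.10),
    (10, "HIS+", "HN", 10, "HIS+", "QB", 3.76),
    (10, "HIS+", "HN", 11, "CYSS", "HN", 5.00),
    (10, "HIS+", "HN", 36, "CYSS", "HA", 5.00),
    (13, "PRO", "HA", 34, "TRP", "HA", 5.00),
]


def classify(row):
    """Label a constraint by the sequence separation of its two residues."""
    res_i, _, atom_i, res_j, _, atom_j, _ = row
    if "SG" in (atom_i, atom_j):
        return "disulfide"  # assumed to be covalent geometry, not NOE data
    separation = abs(res_j - res_i)
    if separation == 0:
        return "intraresidue"
    if separation == 1:
        return "sequential"
    return "nonsequential"


counts = Counter(classify(row) for row in SAMPLE_ROWS)
print(dict(counts))
```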

.upl  DIANA modified version of myoamod11.upl

  4 CYSS CB    36 CYSS SG     3.10
         SG    36 CYSS CB     3.10
         SG    36 CYSS SG     2.10
  7 LYS+ HN     7 LYS+ QB     3.54
  8 GLY  HN     9 GLY  HN     3.00
  9 GLY  HA1   38 LYS+ HA     4.00
         HA2   38 LYS+ HA     4.00
         QA    38 LYS+ HA     3.43
 10 HIS+ HN    12 PHE  HE2    4.00
         HN    37 CYSS HN     3.00
         QB    11 CYSS HN     3.68
         QB    12 PHE  HE2    3.88
 11 CYSS HA    12 PHE  HN     3.00
         HA    36 CYSS HA     3.00
         HA    37 CYSS HN     4.00
         CB    30 CYSS SG     3.10
         QB    12 PHE  HN     5.88
         SG    30 CYSS CB     3.10
         SG    30 CYSS SG     2.10
 12 PHE  HN    12 PHE  QB     3.53
         HN    35 LYS+ HN     3.00
         HN    36 CYSS HA     5.00
         HE2   37 CYSS HN     4.00
         HE2   37 CYSS HB2    4.00
         HE2   37 CYSS HB3    4.00
         HE2   37 CYSS QB     3.43
 13 PRO  HA    34 TRP  HA     4.00
 14 LYS+ HN    15 GLU  HN     3.00
         HN    34 TRP  HA     3.00
         HN    35 LYS+ HN     4.00
         HA    16 LYS+ HN     4.00
 15 GLU  HN    15 GLU  QG     4.88
         HN    16 LYS+ HN     3.00
 16 LYS+ HN    16 LYS+ QB     3.54
         HA    18 CYSS HN     3.00
 17 ILE  HN    17 ILE  HB     3.00
 18 CYSS CB    37 CYSS SG     3.10
         QB    24 ASP  QB     4.75
         SG    37 CYSS CB     3.10
         SG    37 CYSS SG     2.10
 19 ILE  HN    19 ILE  HB     3.00
         HA    20 PRO  HA     3.00
 21 PRO  HA    24 ASP  QB     4.88
 22 SER  HN    22 SER  HB2    4.00
         HN    22 SER  HB3    4.00
         HN    22 SER  QB     3.43
         HN    23 SER  HN     3.00
 23 SER  HN    23 SER  HB2    4.00
         HN    23 SER  HB3    4.00
         HN    23 SER  QB     3.43
         HN    24 ASP  HN     4.00
         QB    38 LYS+ HN     4.88
 24 ASP  HN    24 ASP  QB     3.54
         HA    26 GLY  HN     4.00
         HA    37 CYSS HA     3.00
         QB    37 CYSS QB     4.75
 25 LEU  HN    25 LEU  QB     3.54
         HN    26 GLY  HN     3.00
         HN    37 CYSS HA     3.00
 26 GLY  QA    27 LYS+ HN     3.22
 27 LYS+ HN    28 MET  HN     4.00
         HA    27 LYS+ QG     3.58
         HA    36 CYSS HN     4.00
 30 CYSS HN    30 CYSS HB2    4.00
         HN    30 CYSS HB3    4.00
         HN    30 CYSS QB     3.43
         HA    31 ARG+ HN     3.00
 31 ARG+ HN    31 ARG+ QB     3.55
         HN    34 TRP  QB     3.88
         HB2   34 TRP  HD1    4.00
         HB3   34 TRP  HD1    4.00
         QB    31 ARG+ HE     4.88
         QB    32 TRP  HN     3.88
         QB    34 TRP  HD1    3.43
 32 TRP  HN    32 TRP  HE3    5.00
         HN    33 LYS+ HN     4.00
         HA    33 LYS+ HN     3.00
         QB    32 TRP  HD1    3.38
         HE3   33 LYS+ HN     4.00
 33 LYS+ HN    34 TRP  HN     3.00
         QB    34 TRP  HD1    4.88
 34 TRP  HN    34 TRP  HD1    3.00
         HA    35 LYS+ HN     3.00
         HB2   35 LYS+ HN     4.00
         HB3   35 LYS+ HN     4.00
         QB    34 TRP  HD1    3.38
         QB    35 LYS+ HN     3.43
 37 CYSS HN    37 CYSS HB2    4.00
         HN    37 CYSS HB3    4.00
         HN    37 CYSS QB     3.43
         QB    38 LYS+ HN     4.88
 38 LYS+ HN    38 LYS+ HB2    4.00
         HN    38 LYS+ HB3    4.00
         HN    38 LYS+ QB     3.43
 39 LYS+ HN    39 LYS+ QB     3.54

MYOA8.OVW (edited)  structures built with myoamod10.upl (last version of
original constraints)

Overview:

Number of accepted structures  : 536 (536 structures started)
Residue range for upper limits : 42
                  lower limits : 42
                  van der Waals: 42
Cutoff for upper limits        : 0.20 A
           lower limits        : 0.20 A
           van der Waals       : 0.10 A
           angle constraints   : 5.00 deg
CPU time                       : 568.83 min
CPU time per structure         : 1.06 min
Average number of iterations   : 2460

Struct    target     upper limits      lower limits      van der Waals     torsion angles
          function   #   sum   max     #   sum   max     #   sum   max     #   sum   max
 1  306   0.71       0   0.8   0.19    0   0.0   0.03    7   2.9   0.34    0   0.0
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Constraint  violation  and  hydrogen  bond  overview  (structures  ordered) : 

Cutoff for target function     : 3.00E+00
Number of structures included  : 44
Number of violated constraints : 468
Number of consistent violations: 0
Maximal hydrogen bond length   : 2.40 A
Maximal hydrogen bond angle    : 35.00 deg
Number of hydrogen bonds       : 146
Number of consistent H-bonds   : 0
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Pairwise  RMSDs  (structures  ordered): 
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MY0A9.0VW  (•difd)  structures  from  new  constraints  (myoamodl2 . upl ) 
Overview: 
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Cutoff  for  target  function 
Number  of  structures  included 
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Number  of  violated  constraints  :  62 

Number of consistent violations: 0
Maximal hydrogen bond length   : 2.40 A
Maximal hydrogen bond angle    : 35.00 deg
Number of hydrogen bonds       : 66
Number of consistent H-bonds   : 0
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Pairwise  RMSDs  (structures  ordered) : 


Number of backbone atoms     : 102
Number of heavy atoms        : 273
Residues considered          : 6..39
Local RMSD segment length    : 3 residues

Mean global backbone RMSD    : 2.83 +/- 0.53 A (1.63..4.49 A)
Mean global heavy atom RMSD  : 4.37 +/- 0.53 A (2.97..5.70 A)
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13.43 

APPENDIX D

GCG FASTA RESULTS


(Peptide) FASTA of: Myoa.Seq from: 1 to: 42 November 17, 1992 18:29

Myotoxin a from venom of Crotalus viridis viridis.
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TO:  SwissProt:*  Sequences:  20,024  Symbols:  6,524,504  Word  Size:  2 


Score   Init1   Initn
 < 2      201     201
   4       18      18
   6       20      20
   8       72      72
  10      544     544
  12     1313    1313
  14      200     200
  16     1934    1934
  18     3511    3511
  20      559     559
  22     4358    4358
  24     4298    4298
  26      447     447
  28     1095    1078
  30      562     554
  32      199     191
  34      249     232
  36      160     160
  38       83      88
  40      106     103
  42       31      46
  44       27      37
  46       25      34
  48        4       8
  50        1       4
  52        0       2
  54        0       2
  56        1       1
  58        0       1
  60        0       2
  62        0       0
  64        0       0
  66        0       0
  68        0       0
  70        0       0
  72        0       0
  74        0       0
  76        0       0
  78        0       0
  80        0       0
> 80        6       6

mean initn score: 19.4 (4.89)
mean init1 score: 19.4 (4.89)


The best scores are:                                           init1 initn   opt

Sw:Myx1$Crovv  P01476 prairie rattlesnake (crotalus virid..      280   280   280
Sw:Myxc$Crodu  P01475 tropical rattlesnake (crotalus duri..      271   271   271
Sw:Myx2$Crovc  P12029 midget faded rattlesnake (crotalus ..      263   263   263
Sw:Myxc$Crovh  P01477 southern pacific rattlesnake (crota..      254   254   254
Sw:Myx1$Crovc  P12028 midget faded rattlesnake (crotalus ..      249   249   249
Sw:Myx2$Crovv  P19861 prairie rattlesnake (crotalus virid..      249   249   249
Sw:Bcr$Human   P11274 human (homo sapiens). breakpoint clu..      43    59    43
Sw:Li12$Caeel  P14585 caenorhabditis elegans. lin-12 prot..       48    59    53
Sw:Urok$Mouse  P06869 mouse (mus musculus). urokinase-typ..       40    58    41
Sw:Rnp$Macru   P00686 red kangaroo (macropus rufus). ribon..      55    55    55
Sw:Urok$Pig    P04185 pig (sus scrofa). urokinase-type plas..     36    54    39
Sw:Pol$Socmv   P15629 soybean chlorotic mottle virus, enzy..      39    53    39
Sw:Dpol$Bpt3   P20311 bacteriophage t3. dna polymerase (ec..      37    52    37
Sw:Nxl1$Denja  P01393 eastern jameson's mamba (dendroaspi..       42    51    42
Sw:Alb2$Xenla  P14872 african clawed frog (xenopus laevis..       36    50    36
Sw:Cin2$Rat    P04775 rat (rattus norvegicus). sodium chann..     50    50    51
Sw:Pol$Camvd   P03556 cauliflower mosaic virus (strain d/h..      42    49    42
Sw:Pol$Camvc   P03555 cauliflower mosaic virus (strain cm1..      42    49    42
Sw:Limbs$Rat   P15800 rat (rattus norvegicus). s-laminin pr..     48    48    48
Sw:Vglm$Leev   P16853 lee virus, m polyprotein precursor (..      39    48    39
Sw:Vglm$Hojov  P16493 hojo virus, m polyprotein precursor..       39    48    39
Sw:Ompb$Chltr  P10553 chlamydia trachomatis, outer membra..       47    47    50
Sw:Ital$Human  P20701 human (homo sapiens), leukocyte adh..       40    47    48
Sw:Nxl2$Denvi  P01395 western green mamba (dendroaspis vi..       38    47    38
Sw:Nxl1$Denvi  P01394 western green mamba (dendroaspis vi..       38    47    38
Sw:Cina$Eleel  P02719 electric eel (electrophorus electri..       47    47    47
Sw:Ibb2$Wheat  P09864 wheat (triticum aestivum). proteina..       46    46    51
Sw:Pa29$Pseau  P20253 mulga snake (pseudechis australis)...       34    46    41
Sw:Ibb2$Setit  P19860 foxtail millet (setaria italica). m..       46    46    47
Sw:Pa2c$Pseau  P20256 mulga snake (pseudechis australis)...       34    46    36
Sw:42$Human    P16452 human (homo sapiens), erythrocyte mem..     46    46    52
Sw:Pa2a$Pseau  P20255 mulga snake (pseudechis australis)...       34    46    36
Sw:Alk1$Human  P03973 human (homo sapiens), antileukoprot..       46    46    52
Sw:Gunb$Psefl  P18126 pseudomonas fluorescens. endoglucan..       46    46    48
Sw:Pa2b$Psepo  P20259 red-bellied black snake (pseudechis..       34    46    37
Sw:Ibb1$Wheat  P09863 wheat (triticum aestivum). proteina..       46    46    52
Sw:Pa2a$Psepo  P20258 red-bellied black snake (pseudechis..       34    46    36
Sw:Ibbr$Orysa  P07084 rice (oryza sativa). bran trypsin i..       46    46    48
Sw:Coat$Socmv  P15627 soybean chlorotic mottle virus, coa..       39    46    40
Sw:Pa21$Pseau  P04056 mulga snake (pseudechis australis)...       34    46    36

Myoa.Seq
Sw:Myx1$Crovv

ID   MYX1$CROVV  STANDARD;  PRT;  42 AA.
AC   P01476;
DT   21-JUL-1986 (REL. 01, CREATED)
DT   21-JUL-1986 (REL. 01, LAST SEQUENCE UPDATE)
DT   01-FEB-1991 (REL. 17, LAST ANNOTATION UPDATE)
DE   MYOTOXIN A (MYOTOXIN 1). . . .

SCORES   Init1: 280  Initn: 280  Opt: 280

100.0% identity in 42 aa overlap

                10        20        30        40
Myoa.S  YKQCHKKGGHCFPKEKICIPPSSDLGKMDCRWKWKCCKKGSG
        ||||||||||||||||||||||||||||||||||||||||||
Myx1$C  YKQCHKKGGHCFPKEKICIPPSSDLGKMDCRWKWKCCKKGSG
                10        20        30        40


Myoa.Seq
Sw:Myxc$Crodu

ID   MYXC$CRODU  STANDARD;  PRT;  42 AA.
AC   P01475;
DT   21-JUL-1986 (REL. 01, CREATED)
DT   21-JUL-1986 (REL. 01, LAST SEQUENCE UPDATE)
DT   01-FEB-1991 (REL. 17, LAST ANNOTATION UPDATE)
DE   MYOTOXIN (CROTAMINE). . . .

SCORES   Init1: 271  Initn: 271  Opt: 271

92.9% identity in 42 aa overlap

                10        20        30        40
Myoa.S  YKQCHKKGGHCFPKEKICIPPSSDLGKMDCRWKWKCCKKGSG
        ||||||||||||||||||:|||||:|||||||:|||||||||
Myxc$C  YKQCHKKGGHCFPKEKICLPPSSDFGKMDCRWRWKCCKKGSG
                10        20        30        40
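
The identity figure reported for this entry can be checked by hand. The following is a minimal Python sketch, not part of the FASTA output, that compares the two 42-residue sequences printed above position by position. The function name `percent_identity` is a hypothetical helper introduced here for illustration, and the comparison assumes an ungapped alignment of equal-length sequences, which holds for this pair.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of positions with identical residues in an ungapped,
    equal-length comparison (an assumption that fits this pair)."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)


# Sequences copied from the alignment above.
myotoxin_a = "YKQCHKKGGHCFPKEKICIPPSSDLGKMDCRWKWKCCKKGSG"
crotamine = "YKQCHKKGGHCFPKEKICLPPSSDFGKMDCRWRWKCCKKGSG"

# 1-based positions where the two sequences differ.
mismatches = [
    (i + 1, x, y)
    for i, (x, y) in enumerate(zip(myotoxin_a, crotamine))
    if x != y
]

print(round(percent_identity(myotoxin_a, crotamine), 1))
print(mismatches)
```

The three differing positions (19, 25 and 33) correspond to the three `:` marks in the match line, and 39 identities out of 42 give the reported 92.9%.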


Myoa.Seq
Sw:Myx2$Crovc

ID   MYX2$CROVC  STANDARD;  PRT;  43 AA.
AC   P12029;
DT   01-OCT-1989 (REL. 12, CREATED)
DT   01-OCT-1989 (REL. 12, LAST SEQUENCE UPDATE)
DT   01-FEB-1991 (REL. 17, LAST ANNOTATION UPDATE)
DE   MYOTOXIN II. . . .

SCORES   Init1: 263  Initn: 263  Opt: 263

92.7% identity in 41 aa overlap

                10        20        30        40
Myoa.S  YKQCHKKGGHCFPKEKICIPPSSDLGKMDCRWKWKCCKKGSG
        ||:|||||||||||||||:|||||:||||||||||||||||
Myx2$C  YKRCHKKGGHCFPKEKICTPPSSDFGKMDCRWKWKCCKKGSVN
                10        20        30        40


Myoa.Seq
Sw:Myxc$Crovh

ID   MYXC$CROVH  STANDARD;  PRT;  43 AA.
AC   P01477;
DT   21-JUL-1986 (REL. 01, CREATED)
DT   21-JUL-1986 (REL. 01, LAST SEQUENCE UPDATE)
DT   01-FEB-1991 (REL. 17, LAST ANNOTATION UPDATE)
DE   MYOTOXIN (TOXIC PEPTIDE C). . . .

SCORES   Init1: 254  Initn: 254  Opt: 254

87.8% identity in 41 aa overlap

                10        20        30        40
Myoa.S  YKQCHKKGGHCFPKEKICIPPSSDLGKMDCRWKWKCCKKGSG
        ||:|||||||||||: ||:|||||:||||||||||||||||
Myxc$C  YKRCHKKGGHCFPKTVICLPPSSDFGKMDCRWKWKCCKKGSVN
                10        20        30        40


Myoa.Seq
Sw:Myx1$Crovc

ID   MYX1$CROVC  STANDARD;  PRT;  43 AA.
AC   P12028;
DT   01-OCT-1989 (REL. 12, CREATED)
DT   01-OCT-1989 (REL. 12, LAST SEQUENCE UPDATE)
DT   01-FEB-1991 (REL. 17, LAST ANNOTATION UPDATE)
DE   MYOTOXIN I. . . .

SCORES   Init1: 249  Initn: 249  Opt: 249

85.4% identity in 41 aa overlap

                10        20        30        40
Myoa.S  YKQCHKKGGHCFPKEKICIPPSSDLGKMDCRWKWKCCKKGSG
        ||:||||:||||||: ||:|||||:||||||||||||||||
Myx1$C  YKRCHKKEGHCFPKTVICLPPSSDFGKMDCRWKWKCCKKGSVN
                10        20        30        40


Myoa.Seq
Sw:Myx2$Crovv

ID   MYX2$CROVV  STANDARD;  PRT;  45 AA.
AC   P19861;
DT   01-FEB-1991 (REL. 17, CREATED)
DT   01-FEB-1991 (REL. 17, LAST SEQUENCE UPDATE)
DT   01-FEB-1991 (REL. 17, LAST ANNOTATION UPDATE)
DE   MYOTOXINS 2 AND 3. . . .

SCORES   Init1: 249  Initn: 249  Opt: 249

85.4% identity in 41 aa overlap

                10        20        30        40
Myoa.S  YKQCHKKGGHCFPKEKICIPPSSDLGKMDCRWKWKCCKKGSG
        ||:||||:||||||: ||:|||||:||||||||||||||||
Myx2$C  YKRCHKKEGHCFPKTVICLPPSSDFGKMDCRWKWKCCKKGSVNNA
                10        20        30        40


Myoa.Seq
Sw:Bcr$Human

ID   BCR$HUMAN  STANDARD;  PRT;  1271 AA.
AC   P11274;
DT   01-JUL-1989 (REL. 11, CREATED)
DT   01-JUL-1989 (REL. 11, LAST SEQUENCE UPDATE)
DT   01-FEB-1991 (REL. 17, LAST ANNOTATION UPDATE)
DE   BREAKPOINT CLUSTER REGION PROTEIN (GENE NAME: BCR). . . .

SCORES   Init1: 43  Initn: 59  Opt: 43

66.7% identity in 9 aa overlap

                                            10        20        30
Myoa.S                              YKQCHKKGGHCFPKEKICIPPSSDLGKMDCRW
                                      ||||:  ||
Bcr$Hu  PTTYRMFRDKSRSPSQNSQQSFDSSSPPTPQCHKRHRHCPVVVSEATIVGVRKTGQIWPN
360  370  380  390  400  410 

40 

Myoa.S  KWKCCKKGSG 

Bcr$Hu  DGEGAFHGDADGSFGTPPGYGCAADRAEEQRRHQDGLPYIDDSPSSSPHLSSKGRGSRDA 
420  430  440  450  460  470 


Myoa.Seq
Sw:Li12$Caeel

ID   LI12$CAEEL  STANDARD;  PRT;  1429 AA.
AC   P14585;
DT   01-JAN-1990 (REL. 13, CREATED)
DT   01-JAN-1990 (REL. 13, LAST SEQUENCE UPDATE)
DT   01-FEB-1991 (REL. 17, LAST ANNOTATION UPDATE)
DE   LIN-12 PROTEIN PRECURSOR (GENE NAME: LIN12). . . .

SCORES   Init1: 48  Initn: 59  Opt: 53

30.8% identity in 26 aa overlap

10  20  30 

Myoa.S  YKQCHKKGGHCFPKEKICIPPSSDLGKMDCRWKW 

:::|  |  |  :  :  :  :  |  :  :  ||| 

Li12$C  TEPITRESVNIIDPRHNRTVLHWIASNSSAEKSEDLIVHEAKECIAAGADVNAMDCDENT
1040  1050  1060  1070  1080  1090

40 

Myoa.S  KCCKKGSG 

Li12$C  PLMLAVLiARRRRLVAYLMKAGADPT lYNKSERSALHQAAANRDFGMMVYMLNSTKLKGDI
1100  1110  1120  1130  1140  1150 


Myoa.Seq
Sw:Urok$Mouse

ID   UROK$MOUSE  STANDARD;  PRT;  433 AA.
AC   P06869;
DT   01-JAN-1988 (REL. 06, CREATED)
DT   01-JAN-1988 (REL. 06, LAST SEQUENCE UPDATE)
DT   01-APR-1990 (REL. 14, LAST ANNOTATION UPDATE)
DE   UROKINASE-TYPE PLASMINOGEN ACTIVATOR PRECURSOR (EC 3.4.21.31) (UPA). . . .

SCORES   Init1: 40  Initn: 58  Opt: 41

44.4% identity in 18 aa overlap

                      10        20        30        40
Myoa.S        YKQCHKKGGHCFPKEKICIPPSSDLGKMDCRWKWKCCKKGSG
                                      |   |  || ::|| :||
Urok$M  KNLKMSVVKLVSHEQCMQPHYYGSEINYKMLCAADPEWKTDSCKGDSGGPLICNIEGRPT
              340       350       360       370       380       390

Urok$M  LSGIVSWGRGCAEKNKPGVYTRVSHFLDWIQSHIGEEKGLAF
              400       410       420       430


Myoa.Seq
Sw:Rnp$Macru

ID   RNP$MACRU  STANDARD;  PRT;  122 AA.
AC   P00686;
DT   21-JUL-1986 (REL. 01, CREATED)
DT   21-JUL-1986 (REL. 01, LAST SEQUENCE UPDATE)
DT   01-MAR-1989 (REL. 10, LAST ANNOTATION UPDATE)
DE   RIBONUCLEASE PANCREATIC (EC 3.1.27.5) (RNASE 1) (RNASE A). . .

SCORES   Init1: 55  Initn: 55  Opt: 55

28.6% identity in 28 aa overlap

                                           10        20        30
Myoa.S                             YKQCHKKGGHCFPKEKICIPPSSDLGKMDCRWK
                                      ||:::  |    : |  ::| |:  :||
Rnp$Ma  LMMKARDMTSGRCKPLNTFIHEPKSVVDAVCHQENVTCKNGRTNCYKSNSRLSITNCRQT
          30        40        50        60        70        80

             40
Myoa.S  WKCCKKGSG

Rnp$Ma  GASKYPNCQYETSNLNKQIIVACEGQYVPVHFDAYV
          90       100       110       120


Myoa.Seq
Sw:Urok$Pig

ID   UROK$PIG  STANDARD;  PRT;  442 AA.
AC   P04185;
DT   20-MAR-1987 (REL. 04, CREATED)
DT   13-AUG-1987 (REL. 05, LAST SEQUENCE UPDATE)
DT   01-APR-1990 (REL. 14, LAST ANNOTATION UPDATE)
DE   UROKINASE-TYPE PLASMINOGEN ACTIVATOR PRECURSOR (EC 3.4.21.31) (UPA).

SCORES   Init1: 36  Initn: 54  Opt: 39

38.9% identity in 18 aa overlap

                      10        20        30        40
Myoa.S        YKQCHKKGGHCFPKEKICIPPSSDLGKMDCRWKWKCCKKGSG
                                      |   | :|| ::|: :||
Urok$P  EQLKMTVVKLVSHRECQQPHYYGSEVTTKMLCAADPQWKTDSCQGDSGGPLVCSTQGRLT
               350       360       370       380       390       400

Urok$P  LTGIVSWGRECAMKDKPGVYTRVSRFLTWIHTHVGGENGLAH
               410       420       430       440


Myoa.Seq
Sw:Pol$Socmv

ID   POL$SOCMV  STANDARD;  PRT;  741 AA.
AC   P15629;
DT   01-APR-1990 (REL. 14, CREATED)
DT   01-APR-1990 (REL. 14, LAST SEQUENCE UPDATE)
DT   01-APR-1990 (REL. 14, LAST ANNOTATION UPDATE)
DE   ENZYMATIC POLYPROTEIN (CONTAINS: ASPARTIC PROTEASE (EC 3.4.23.-),

SCORES   Init1: 39  Initn: 53  Opt: 39

100.0% identity in 3 aa overlap

               10        20        30        40
Myoa.S  KQCHKKGGHCFPKEKICIPPSSDLGKMDCRWKWKCCKKGSG
                                      |||
Pol$So  CINYIAPEGFFRTLALERKHLQKKISVKNPWKWDTIDTKMVQSIKGKIQSLPKLYNASIQ
450  460  470  480  490  500 

Pol$So  DFLIVETDASQHSWSGCLRALPRESKKSDSMNSGYRPCDLCTGSSSASSDNSPAEIDKCH
510  520  530  540  550  560 


Myoa.Seq
Sw:Dpol$Bpt3

ID   DPOL$BPT3  STANDARD;  PRT;  704 AA.
AC   P20311;
DT   01-FEB-1991 (REL. 17, CREATED)
DT   01-FEB-1991 (REL. 17, LAST SEQUENCE UPDATE)
DT   01-FEB-1991 (REL. 17, LAST ANNOTATION UPDATE)
DE   DNA POLYMERASE (EC 2.7.7.7) (GENE NAME: 5). . . .

SCORES   Init1: 37  Initn: 52  Opt: 37

62.5% identity in 8 aa overlap

                     10        20        30        40
Myoa.S       YKQCHKKGGHCFPKEKICIPPSSDLGKMDCRWKWKCCKKGSG
                                      |||: :||
Dpol$B  EIAKTVIEVAQEAMRWVGEHWNFRCLLDTEGKMGANWKECH
            670       680       690       700


Myoa.Seq
Sw:Nxl1$Denja

ID   NXL1$DENJA  STANDARD;  PRT;  72 AA.
AC   P01393;
DT   21-JUL-1986 (REL. 01, CREATED)
DT   21-JUL-1986 (REL. 01, LAST SEQUENCE UPDATE)
DT   01-APR-1988 (REL. 07, LAST ANNOTATION UPDATE)
DE   LONG NEUROTOXIN 1. . . .

SCORES   Init1: 42  Initn: 51  Opt: 42

38.9% identity in 18 aa overlap

                   10        20        30        40
Myoa.S     YKQCHKKGGHCFPKEKICIPPSSDLGKMDCRWKWKCCKKGSG
           || : :|:  | : |:||
Nxl1$D  RTCYKTYSDKSKTCPRGEDICYTKTWCDGFCSQRGKRVELGCAATCPKVKTGVEIKCCST
                10        20        30        40        50        60


Myoa.Seq
Sw:Alb2$Xenla

ID   ALB2$XENLA  STANDARD;  PRT;  607 AA.
AC   P14872;
DT   01-APR-1990 (REL. 14, CREATED)
DT   01-NOV-1990 (REL. 16, LAST SEQUENCE UPDATE)
DT   01-NOV-1990 (REL. 16, LAST ANNOTATION UPDATE)
DE   74 KD SERUM ALBUMIN PRECURSOR. . . .

SCORES   Init1: 36  Initn: 50  Opt: 36

60.0% identity in 10 aa overlap

                                             10        20        30
Myoa.S                               YKQCHKKGGHCFPKEKICIPPSSDLGKMDCR
                                      ||:|| ::||
Alb2$X  EHPDDLLSAFIHEEARNHPDLYPPAVLALTKQYHKLAEHCCEEEDKEKCFSEKMKQLMKQ
160  170  180  190  200  210 

40 

Myoa.S  WKWKCCKKGSG

Alb2$X  SHSIEDKQHHFCWILDNFPEKVLKALNLARVSHRYPKAEFKLAHNFTEEVTHFIKDCCHD
220  230  240  250  260  270 


Myoa.Seq
Sw:Cin2$Rat

ID   CIN2$RAT  STANDARD;  PRT;  2005 AA.
AC   P04775;
DT   13-AUG-1987 (REL. 05, CREATED)
DT   13-AUG-1987 (REL. 05, LAST SEQUENCE UPDATE)
DT   01-APR-1990 (REL. 14, LAST ANNOTATION UPDATE)
DE   SODIUM CHANNEL PROTEIN II, BRAIN. . . .

SCORES   Init1: 50  Initn: 50  Opt: 51

60.0% identity in 10 aa overlap

                  10        20        30        40
Myoa.S    YKQCHKKGGHCFPKEKICIPPSSDLGKMDCRWKWKCCKKGSG
                                      || :|:|||:
Cin2$R  STVDIGAPAEGEQPEAEPEESLEPEACFTEDCVRKFKCCQISIEEGKGKLWWNLRKTCYK
1150  1160  1170  1180  1190  1200 

Cin2$R  IVEHNWFETFIVFMILLSSGALAFEDIYIEQRKTIKTMLEYADKVFTYIFILEMLLKWVA

1210  1220  1230  1240  1250  1260 


